CANADIAN 
JOURNAL OF MATHEMATICS 




Journal Canadien de Mathématiques


VOL. XIII - NO. 1
1961


Foundations of the theory of dynamical systems of infinitely 
many degrees of freedom, II I. E. Segal 


Reciprocal convergence classes for Fourier series 
and integrals A. P. Guinand 


The analytic continuation of the Riemann-Liouville 
integral in the hyperbolic case Marcel Riesz 


Construction of primitives of generalized derivatives with 
applications to trigonometric series P. S. Bullen 


A note on non-negative matrices C. R. Putnam 
Summability methods on matrix spaces Josephine Mitchell 


A theorem on partially ordered sets, with applications to fixed 
point theorems Smbat Abian and Arthur B. Brown 


Arithmetic linear transformations and abstract prime 
number theorems S. A. Amitsur 


Block design games A. J. Hoffman and Moses Richardson 
Simple algebras of type (1, 1) are associative Erwin Kleinfeld
The groups of regular complex polygons D. W. Crowe
Decomposition of finite graphs into open chains
C. St. J. A. Nash-Williams


Homotopy and isotopy properties of
topological spaces Sze-Tsen Hu


Published for 
THE CANADIAN MATHEMATICAL CONGRESS 


by the 


University of Toronto Press 





EDITORIAL BOARD 


H. S. M. Coxeter, G. F. D. Duff, R. D. James, R. L. Jeffery, 
J.-M. Maranda, G. de B. Robinson, P. Scherk


with the co-operation of 


D. B. DeLury, J. Dixmier, W. Fenchel, H. Freudenthal, I. Kaplansky,
N. S. Mendelsohn, C. A. Rogers, H. Schwerdtfeger, A. W. Tucker,
W. J. Webber, M. Wyman 


The chief languages of the Journal are English and French. 


Manuscripts for publication in the Journal should be sent to the 
Editor-in-Chief, G. F. D. Duff, University of Toronto. Authors are 
asked to write with a sense of perspective and as clearly as possible, 
especially in the introduction. Regarding typographical conventions, 
attention is drawn to the Author’s Manual of which a copy will be 
furnished on request. 


All other correspondence should be addressed to the Managing 
Editor, G. de B. Robinson, University of Toronto. 


The Journal is published quarterly. Subscriptions should be sent
to the Managing Editor. The price per volume of four numbers
is $10.00. This is reduced to $5.00 for individual members of recognized
Mathematical Societies.


The Canadian Mathematical Congress gratefully acknowledges the 
assistance of the following towards the cost of publishing this Journal: 


University of Alberta Assumption University 
University of British Columbia Carleton College 
Dalhousie University Ecole Polytechnique 
Université Laval Loyola College 
University of Manitoba McGill University 
McMaster University Université de Montréal 
Mount Allison University Nova Scotia Technical College 
Queen’s University St. Mary's University 
University of Saskatchewan University of Toronto 

National Research Council of Canada 

and the 
American Mathematical Society 


AUTHORIZED AS SECOND CLASS MAIL, POST OFFICE DEPARTMENT, OTTAWA 











FOUNDATIONS OF THE THEORY OF DYNAMICAL 
SYSTEMS OF INFINITELY MANY 
DEGREES OF FREEDOM, II 


I. E. SEGAL 


1. Introduction. The notion of quantum field remains at this time still 
rather elusive from a rigorous standpoint. In conventional physical theory 
such a field is defined in essentially the same way as in the original work of 
Heisenberg and Pauli (1) by a function $\phi(x, y, z, t)$ on space-time whose values
are operators. It was recognized very early, however, by Bohr and Rosenfeld 
(2) that, even in the case of a free field, no physical meaning could be attached 
to the values of the field at a particular point—only the suitably smoothed 
averages over finite space-time regions had such a meaning. This physical 
result has a mathematical counterpart in the impossibility of formulating 
$\phi(x, y, z, t)$ as a bona fide operator for even the simplest fields (in any fashion
satisfying the most elementary non-trivial theoretical desiderata), while on
the other hand for suitable functions $f$, the integral
$\int \phi(x, y, z, t) f(x, y, z, t)\,dx\,dy\,dz\,dt$ could be so formulated. This mathematical development began
with the work of Fock (3), in which the field was treated in the conventional 
way without smoothing, but which gave a concrete representation for a free 
field that was capable of extension to a representation by bona fide operators, 
of the smoothed field operators, in the non-relativistic case, an observation 
that formed the basis for the independent work of Friedrichs (4) and Cook (5). 
The latter gave in rigorous terms the basic mathematical theory of the situa- 
tion. Additional complications arise in giving an effective relativistic treatment, 
but it is now established that the suitably smoothed averages of the standard 
relativistic free real fields may be formulated as bona fide self-adjoint operators 
in Hilbert space in the strict mathematical sense (see below). 

There has not yet been analogous progress for the case of interacting fields,
and in the work of Wightman (6), for example, it has merely been postulated 
that field averages could be given meaning as operators. The expectation 
values of functions of these operators in the so-called physical vacuum state 
determine the observable consequences of the theory, and instead of attempting 
to specify the theory by partial differential equations one may rather attempt 


Received August 31, 1959. This research was supported in part by the Office of Scientific 
Research, and also at an earlier stage by the National Science Foundation, at which time the 
author enjoyed the hospitality of the Institutes of Mathematics and Theoretical Physics of 
the University of Copenhagen.













this through a more direct description of such vacuum expectation values. 
The efforts of Källén and Wightman (7) have been directed towards a determination
of possible forms for the vacuum expectation values of simple
products of smoothed field operators, and in the case of the triple product 
certain highly refined analytical information has been obtained from the 
postulates of Lorentz-invariance, microcausality (= "local commutativity"),
and positivity of the energy. The finiteness of the vacuum expectation values 
of such products may, however, be questioned, and in fact in conventional 
unrenormalized field theory they appear as infinite. In addition, even in the 
hypothetical case that these values are all finite, they do not necessarily fix 
the theory, that is, the vacuum expectation values of all (smooth) functions
of the field operators. 

The situation is in a way rather similar to, although vastly more complex 
than, that with regard to the specification of a probability distribution by its 
moments. The moments need not be finite; and even when they are finite they 
do not necessarily determine the distribution (cf. for example (8)). The 
argument that such expectation values must be finite because they have 
simple physical interpretations (9A) is quite parallel to the argument that 
the second moment of the distribution on the line with element of probability
$a\pi^{-1}(a^2 + x^2)^{-1}\,dx$ must be finite because it measures the physical parameter $a$
representing the dispersion of the distribution; speaking loosely in the manner
of conventional physical theory, this second moment is easily seen to be
proportional to $a$ by an "infinite constant."
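
The divergence just described can be made concrete. The following Python sketch, offered only as an illustration, evaluates the second moment of the density $a\pi^{-1}(a^2 + x^2)^{-1}$ truncated to the interval $[-L, L]$, using the elementary antiderivative of $x^2/(a^2 + x^2)$, alongside the characteristic function $e^{-a|t|}$ of the same distribution (a standard fact that is assumed here rather than taken from the text). The function names are hypothetical.

```python
import math

def truncated_second_moment(a, L):
    """Integral of x^2 * a / (pi * (a^2 + x^2)) over [-L, L], in closed form."""
    # The antiderivative of x^2 / (a^2 + x^2) is x - a * atan(x / a).
    return (2.0 * a / math.pi) * (L - a * math.atan(L / a))

def characteristic_function(a, t):
    """E[exp(i t X)] for the same distribution; finite for every real t."""
    return math.exp(-a * abs(t))

if __name__ == "__main__":
    a = 1.0
    for L in (1e1, 1e2, 1e3, 1e4):
        # The truncated moment keeps growing with L; the characteristic
        # function does not depend on any cut-off at all.
        print(L, truncated_second_moment(a, L), characteristic_function(a, 1.0))
```

The truncated moment grows like $2aL/\pi$ as $L$ increases, that is, it is proportional to $a$ with a divergent factor, whereas the characteristic function is bounded by 1 for all $t$.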

A rather natural way to attempt to remedy this situation is to pursue the
analogue in the field-theoretic case for the characteristic function in the theory
of probability distributions. This is always finite, is known to determine the
distribution, and moreover is capable of being characterized intrinsically. The
present paper obtains such an analogue in connection with a general (non- 
pathological) state of a linear field. An interacting field on a particular space- 
like surface can be transformed into a linear field (by taking it in the so-called 
interaction representation, as described for example in (9B)), whereupon its 
vacuum state transforms into an (analytically rather inaccessible) state of 
the linear field. The present results thereby have implications for the vacuum 
state of an interacting field. It would in certain respects be more useful to be 
able to treat the interacting fields directly, but the mathematically ambiguous 
character of such fields at present seems to make it out of the question to give 
any rigorous treatment of the matter, and in addition there is the apparent 
lack of any formal characterization for the generating functions $E[e^{i\int \phi f}]$
(where $E$ is the vacuum state expectation functional, $\phi$ is an interacting field,
and $f$ a general smoothing function) that would form the analogue for the
"Heisenberg" fields of the present functional for a linear field.

At any rate, we show here (rigorously) that a regular state of a linear field 
(of arbitrary unitary transformation properties) can be characterized by a
functional on the corresponding classical wave functions. That is to say, for 










STATES OF A LINEAR BOSON FIELD 3 


example, that any regular state of the quantized hermitian Klein-Gordon field 
satisfying the equation 
$$\Box\phi = m^2\phi$$

is determined by a functional defined on the manifold of all classical real- 
valued normalizable solutions of this equation. The generating functionals that 
arise in this way are characterized intrinsically, and it is shown how the state 
may be recovered from the functional. The attainment of these results requires 
the suitable grouping together of a fairly wide array of scientific developments, 
but the proofs do not involve any individual points of great technical difficulty. 

The present generating functional thus appears as considerably more 
economical, and mathematically distinctly more viable, than the character- 
ization of a state through the expectation values of products of field operators. 
It has, however, a rather less direct connection with conventional practice 
in so-called renormalization theory than the product approach. 

In the present paper only Bose-Einstein fields are treated, but the same 
methods can be adapted to the case of Fermi-Dirac fields. 


2. The general linear boson field. The conventional treatments of 
linear field theory start from specific sets of linear partial differential equations, 
and arrive at formal operator-valued functions satisfying the same partial 
differential equations and certain non-trivial commutation relations, after a 
procedure that varies somewhat from equation to equation. The treatment 
for the photon case is in particular rather parallel to that for the scalar meson 
case, but involves additional technical complications, which are somewhat 
space-consuming and significantly complicate the notation. In addition these 
treatments have no immediate extension to systems that may be defined not 
by partial differential equations in ordinary space-time, but in a more general 
space-time manifold; or which are covariant not with respect to the Lorentz 
group, but with respect to a more general one. A further difficulty is a funda- 
mental lack of uniqueness—for any given linear quantum field, there exist 
infinitely many others, satisfying the same commutation relations and partial 
differential equations, but no two of which are connected by a unitary trans- 
formation (cf. (10)). 

It is therefore relevant that there is available a perfectly general, rigorous, 
and quite mechanical procedure for linear quantization, whenever the states 
of the classical system being considered form a complex Hilbert space (or in 
fact, somewhat more generally). The commutation relations in particular are 
fixed once the structure of this Hilbert space is specified, and no further 
examination of the field equations is required. This unique mathematical 
structure may appropriately be called the ‘“‘general linear boson field’’; its 
use makes it possible to deal with the commutation relations for extensive 
classes of fields without the burden of complicated singular functions in the 
formalism, or the need to utilize generalized functions such as Schwartz’ 
distributions in order to rigorize parts of the analysis. The quantization of a 













photon field is in particular, for example, reduced to the classical (that is, 
unquantized) problem of showing that the real normalizable solutions of 
Maxwell’s equations in a vacuum form a complex Hilbert space in a unique 
Lorentz-invariant manner. 

More generally, the normalizable classical solutions of a relativistic linear 
field equation form the complex Hilbert space $\mathfrak{H}$ that is basic in the following
for the treatment of the corresponding field. To present this development in 
the most elementary fashion, consider for example the real solutions of the 
Klein-Gordon equation 

$$\Box\phi = m^2\phi.$$

This is to be interpreted as a heuristic equation, for the relevant solutions are 
not necessarily conventional functions, but generalized ones. For a rigorous 
treatment it is simplest to take the Fourier transform $\Phi$ of $\phi$ as basic. In
such terms $\mathfrak{H}$ consists of all complex-valued $\Phi$ on the hyperboloid $k^2 = m^2$
(here $k$ is the vector with components $(k_0, k_1, k_2, k_3)$ and $k^2$ denotes the Lorentz
squared-length, $k^2 = k_0^2 - k_1^2 - k_2^2 - k_3^2$), such that $\Phi(-k) = \overline{\Phi(k)}$
(corresponding to the reality of the field) and with


$$\|\Phi\|^2 = \int |\Phi(k)|^2\,d\kappa(k) < \infty,$$


where $d\kappa(k) = |k_0|^{-1}\,dk_1\,dk_2\,dk_3$, and is characterized as the unique regular
measure on the hyperboloid (within a constant factor) that is Lorentz-in- 
variant. The physical significance of the finiteness condition is more apparent 
if one deals, equivalently, with positive frequency rather than real solutions 
of the Klein-Gordon equation (cf. §3 of (11)), for these are conventionally
interpretable as single-particle wave functions, and the normalizability 
corresponds to the existence in physical principle of an individual free particle 
(a non-normalizable wave function such as a plane wave having a somewhat 
ambiguous interpretation as a beam of particles). 

The description of the corresponding linear quantum field involves basically 
the formulation and labelling of the field observables, and in particular the 
specification of the commutation relations of the field variables. Convention- 
ally this is achieved in a heuristic fashion, by postulating the existence of an 
essentially unique (that is, unique within unitary equivalence, when irreducibility
is present) operator-valued function $\phi$ such that


$$\Box\phi = m^2\phi$$


and the commutator $[\phi(x), \phi(x')] = -iD(x - x')$, where $D$ is a certain
singular function, and $x$ is written in place of the 4-tuple $(x, y, z, t)$. No such
operator-valued function is known to exist, and actually there is practically 
conclusive evidence that it cannot exist in any literal sense; and in any event 
it could not be unique within unitary equivalence even if its range of values 
formed an irreducible set of operators. 

In order to deal in a mathematically clear and physically conservative way 
with such a matter, it is appropriate to make first a purely mathematical 










construction and development; next to make a statement of what mathematical 
objects in this construction represent the observable or theoretically relevant 
physical objects associated with the physical system motivating the con- 
struction; and finally to establish the essential agreement between the resulting 
physical theory and the conventional one and/or experimental indications. 
The first part of this procedure should be mathematically rigorous; the second 
part should be precise, but necessarily only in the sense of legal rather than 
mathematical definitions; while the third part may well involve quite heuristic 
elements, and in fact this is necessarily the case when dealing with a theory 
whose conventional form is heuristic, as is quantum field theory. Accordingly 
we proceed as follows. 


Mathematical construction. There is no compelling reason not to use 
terminology here that is indicative of the physical object being considered, 
and a great deal of circumlocution may be avoided in this way. In particular, 
the term "field" will be so used, but several different mathematical objects
related to various heuristic types of fields must be distinguished. It will suffice 
here to deal with concrete, general, clothed, and zero-interaction linear boson
fields.


Definition 1. Let $\mathfrak{H}$ be a real linear vector space, and let $B$ be a given
skew-symmetric bilinear form over $\mathfrak{H}$.

(a) A concrete LBF (LBF = linear boson field) over ($\mathfrak{H}$, $B$) is a map $z \to R(z)$
from $\mathfrak{H}$ to the self-adjoint operators in a complex Hilbert space $\mathfrak{K}$ such
that


(x) $e^{iR(z)} e^{iR(z')} = e^{iR(z+z')} e^{iB(z,z')}$ ($z$, $z'$ arbitrary in $\mathfrak{H}$).


Two such fields, $R(.)$ on $\mathfrak{K}$ and $R'(.)$ on $\mathfrak{K}'$, over the same ($\mathfrak{H}$, $B$), are
unitarily equivalent in case there exists a unitary operator $V$ from $\mathfrak{K}$ onto $\mathfrak{K}'$
such that

$V R(z) V^{-1} = R'(z)$, for all $z$ in $\mathfrak{H}$.


(b) A bounded field observable of a given concrete LBF is defined as a bounded
operator on $\mathfrak{K}$ that is a limit of a uniformly convergent sequence of operators,
each of which is in the weakly closed ring of operators generated by the
$e^{iR(z)}$ for $z$ ranging over some finite-dimensional (but otherwise arbitrary)
linear subspace of $\mathfrak{H}$. The set of all bounded field observables is then a uniformly
closed self-adjoint algebra of operators on $\mathfrak{K}$ (cf. (12), referred to
henceforth as "I").

(c) Two concrete LBFs $R(.)$ and $R'(.)$ on $\mathfrak{K}$ and $\mathfrak{K}'$, over the same
($\mathfrak{H}$, $B$), are said to be physically equivalent in case there is a one-to-one correspondence
between their respective bounded hermitian field observables
preserving the operations of addition and squaring. A general LBF over
($\mathfrak{H}$, $B$) is defined as a physical-equivalence class of concrete LBFs. We note
that when $\mathfrak{H}$ is a complex Hilbert space and $B$ is the canonically associated
skew form (cf. below), there is only one general LBF.













(d) A clothed LBF over ($\mathfrak{H}$, $B$) is a couple consisting of a general LBF
over ($\mathfrak{H}$, $B$), together with a given state $E$ of the (abstract) algebra of field
observables. $E$ is said to be regular in case its restriction to the weakly closed
ring of operators generated by the $e^{iR(z)}$, as $z$ ranges over a finite-dimensional
subspace of $\mathfrak{H}$ on which $B$ is non-degenerate, is weakly continuous relative
to the unit sphere of this ring, for every such finite-dimensional subspace, and
some concrete representative for the general LBF. An LBF is properly clothed
if the associated state $E$ is regular.

(f) When a real linear vector space $\mathfrak{H}$ has in addition a designated structure
as a complex Hilbert space, compatible with its real-linear structure, the notation
($\mathfrak{H}$, $B$) will be understood to refer to $\mathfrak{H}$ as a real linear vector space,
with $B(z, z') = \tfrac{1}{2}\operatorname{Im}[(z, z')]$; and the notation $\mathfrak{H}$ alone may refer to the
couple ($\mathfrak{H}$, $B$), when it is clear from the context that it is this couple that is
relevant.

(g) As an example of a clothed LBF the zero-interaction LBF is defined as
that clothed LBF over the complex Hilbert space $\mathfrak{H}$ for which the given
state $E$ is invariant under the induced action of all unitary operators on $\mathfrak{H}$
(cf. I, and especially Cor. 3.1 showing the uniqueness of the free LBF).


Physical interpretation. If $\phi$ is for example the conventional real Klein-Gordon
field and $f$ is any smooth function on space-time that vanishes at
infinity, the field average $\int \phi(x) f(x)\,d_4x$ is just such an $R(z)$. The appropriate $\mathfrak{H}$
is just that defined above, consisting of all normalizable solutions of the
Klein-Gordon equation; the appropriate $z$ is that function on the mass hyperboloid
(that is, the manifold $k^2 = m^2$) that coincides there with the complex
conjugate of the Fourier transform of $f$; and the appropriate skew-symmetric
form $B(z, z')$ is just that defined equivalently as
$B(z, z') = \tfrac{1}{2}\iint D(x - x') f(x) f'(x')\,dx\,dx'$, where $z'$ is related to $f'$ in the same fashion as $z$ to $f$, and $D$
is the conventional singular function such that $[\phi(x), \phi(x')] = -iD(x - x')$,
or, in a form in which the finiteness of $B(z, z')$ is more apparent,

$$B(z, z') = -\tfrac{1}{2} i \int_M \operatorname{sgn}(k_0)\, z'(k)\, z(-k)\, d\kappa(k)$$


where M denotes the mass hyperboloid. The equation (x) is the bounded 
(Weyl) form of the infinitesimal relation 


$$[R(z), R(z')] = -2iB(z, z'),$$


to which it is formally equivalent. 

Conventionally it was assumed, then, that the Klein-Gordon field is concrete 
and irreducible, and that this sufficed to define the field uniquely. The dis- 
covery that in actuality this was very far from being the case led to the 
introduction of general linear fields in I, in which the more sophisticated form 
of quantum phenomenology developed originally in (13) is employed. The 
notion of physical equivalence above seems at first glance insufficiently 
restrictive, but it is shown in (13) that it implies that the two systems have 







corresponding pure states, corresponding observables have identical spectral 
values and probability distributions in given states, etc., so that the two 
systems are in fact in all observable respects the same. 

The distinguished state of a clothed LBF is physically the vacuum state, 
and a more conventional formulation may be obtained from the corres- 
pondence between states and representations of operator algebras, which leads 
to the result (cf. I) that for any properly clothed LBF there is a concrete LBF
with a distinguished vector $v$, whose transforms under the $e^{iR(z)}$ span the
representation space $\mathfrak{K}$, and which plays the role of the conventional physical vacuum
state vector, in that the vacuum expectation value of the observable represented
by the operator $A$ is $(Av, v)$. Conversely, such a concrete LBF with
distinguished vector v gives rise to a properly clothed LBF associated with it 
in the foregoing fashion; and the concrete LBF-with-vector is uniquely 
determined, within unitary equivalence, by the clothed LBF. 

The definition of regular state is rather technical, but is admissible from a 
purely physical point of view, since only those values of the state on the field 
observables are involved; and is justified by the existence of various equivalent
formulations. It is surely reasonable from an empirical-physical viewpoint to
require that for a physical state $E$, $E[e^{itR(z)}]$ be a continuous function of $t$,
for any fixed vector $z$ in $\mathfrak{H}$, and the regular states are precisely those that have
this property, and in addition are determined in a natural way by the expectation
values $E[e^{iR(z)}]$ for all $z$. Alternatively, a regular state is one whose restriction
to any subsystem of a finite number of degrees of freedom (that is, the
ring of operators generated by the $e^{iR(z)}$ as $z$ ranges over a finite-dimensional
subspace of $\mathfrak{H}$) is a normalizable (possibly mixed) state in essentially the
conventional sense, that is, it has the form E(A) = tr (AD) for some operator 
D of absolutely convergent trace. It should be noted that the present notion 
of regularity is more stringent than that employed in I, which permitted a 
theoretical generality that is physically not entirely appropriate. In fact 
Corollary 3.1 of I is correct (at least in proof) only with the present notion of 
regular state, by virtue of the possible existence of a pathological state other 
than the zero-interaction vacuum, which would agree with the zero-interaction 
vacuum on all sufficiently smooth observables, in particular on all observables 
that are uniform limits of products of those of the form f(R(z)), for some z 
and continuous function f that vanishes at infinity, but not on the weak limits 
of such. A state that is not determined by its values on such observables could 
be fully determined only through the use of infinite fields; it could never be 
obtained as a limit of states in cut-off theories. The restriction to regular 
states thus amounts to a type of universally covariant cut-off; in place of it 
one could substantially limit the observables to those ‘‘smooth”’ ones obtainable 
in the fashion indicated, which it can reasonably be argued are the only ones 
that can actually be observed even conceptually. 

An interacting relativistic field on a particular space-like surface, in the 
interaction representation, gives a formal example of a linear boson field clothed 
















by the physical vacuum; in conventional theory, this clothing degenerates as 
the space-like surface recedes or advances into the infinite past or future, 
and the zero-interaction LBF is obtained. 


Formal equivalence of the present and the conventional formalisms. It must
be shown how to define the conventional quantized field $\phi(x)$, and that this
satisfies the relevant partial differential equation and also the canonical
commutation relations. To this end let $\delta_x'$ denote the projection of the delta-function
at the point $x$ on the manifold of solutions of the relevant partial
differential equation; for example, in the case of the Klein-Gordon equation,
$\delta_x'$ is the reciprocal Fourier transform of the function in momentum space
that agrees with $e^{ik\cdot x}$ on the mass hyperboloid and vanishes outside the hyperboloid.
Then set $\phi(x) = R(\delta_x')$; $\delta_x'$ is an improper element of $\mathfrak{H}$, but this is inevitable
since $\phi(x)$ is an improper operator. That $\phi$ as thus defined satisfies the given
partial differential equations is a simple deduction from the Parseval formula
for Fourier transforms. That the canonical commutation relations hold follows
by substitution of $\delta_x'$ and $\delta_{x'}'$ for $z$ and $z'$ in the relation
$[R(z), R(z')] = -2iB(z, z')$.

It must also be shown that conversely, from such a conventional quantized
field $\phi$, the present operators $R(z)$ can be constructed. If $f$ is any smooth
function on space-time that vanishes at infinity, the operator $\int \phi(x) f(x)\,d_4x$
is defined as R(z), with z equal to the projection of f on the space of solutions 
of the relevant partial differential equation. For a non-scalar field this definition 
extends with the use of the Lorentz-invariant inner product in the finite- 
dimensional spin space for the field in question. Although the projection in 
question is singular as an operator in a Hilbert space, the z’s may be analytically 
well defined for appropriate f, and thereby the R(z) also. 


3. Characterization and uniqueness of the generating functional. 
For any state $E$ of a general linear boson field over ($\mathfrak{H}$, $B$), the functional
$\mu(z) = E[e^{iR(z)}]$ is well defined, and may be called the generating functional
for the state. From it, the expectation values of arbitrary products of the field 
(at distinct points) may be obtained by differentiation, at least heuristically, 
when such expectation values are finite. For foundational purposes what is 
essential is 


THEOREM 1. A (complex-valued) function $\mu$ on $\mathfrak{H}$ is the generating functional
of a regular state $E$ of the general linear boson field over ($\mathfrak{H}$, $B$) with non-degenerate
$B$ if and only if the restrictions of $\mu$ to arbitrary finite-dimensional subspaces of
$\mathfrak{H}$ are continuous, $\mu(0) = 1$, and


$$\sum_{j,k \in F} \mu(z_j - z_k)\, e^{iB(z_j, z_k)}\, a_j \bar{a}_k \geq 0$$


for arbitrary $z_j$ in $\mathfrak{H}$ and complex numbers $a_j$, $F$ being any finite index set. The
functional $\mu$ uniquely determines $E$.
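
The positivity condition of the theorem can be probed numerically in the simplest case. The Python sketch below is an illustration under stated assumptions, not part of the original argument: it takes the one-dimensional complex Hilbert space $\mathbb{C}$ with the assumed conventions $(z, z') = z\bar{z}'$ and $B(z, z') = \tfrac{1}{2}\operatorname{Im}(z, z')$, and examines Gaussian candidates $\mu_c(z) = e^{-c|z|^2}$ on the three points $0$, $\varepsilon$, $i\varepsilon$ with $\varepsilon = 0.5$. Since the determinant of a hermitian matrix is the product of its eigenvalues, a negative determinant rules out positive semi-definiteness. The helper names `B`, `mu`, `gram`, and `det3` are hypothetical.

```python
import cmath

def B(z, w):
    # Assumed skew form on C: B(z, w) = (1/2) Im(z * conj(w)).
    return 0.5 * (z * w.conjugate()).imag

def mu(z, c):
    # Candidate generating functional: a Gaussian with parameter c.
    return cmath.exp(-c * abs(z) ** 2)

def gram(points, c):
    # Matrix M[j][k] = mu(z_j - z_k) * exp(i B(z_j, z_k)) from the theorem.
    return [[mu(zj - zk, c) * cmath.exp(1j * B(zj, zk)) for zk in points]
            for zj in points]

def det3(m):
    # Cofactor expansion of a 3x3 complex determinant along the first row.
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

points = [0j, 0.5 + 0j, 0.5j]
det_quarter = det3(gram(points, 0.25)).real   # c = 1/4
det_eighth = det3(gram(points, 0.125)).real   # c = 1/8
print(det_quarter, det_eighth)
```

Under these assumptions the determinant is positive for $c = 1/4$ and negative for $c = 1/8$, so the narrower Gaussian fails the condition on this three-point set and cannot be the generating functional of any state.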







The "only if" part follows from the obvious fact that


$$\Big(\sum_j a_j e^{iR(z_j)}\Big)^{*} \Big(\sum_j a_j e^{iR(z_j)}\Big) \geq 0,$$


together with the relation (x). To prove the "if" part, let $\mu$ be given satisfying
the stated conditions. Put $\mathfrak{K}_0$ for the set of all complex-valued functions on $\mathfrak{H}$
that vanish except at a finite set of points, with the inner product


$$(f, g) = \sum_{z, z'} f(z)\, \overline{g(z')}\, \mu(z - z')\, e^{iB(z, z')}$$


($f$ and $g$ in $\mathfrak{K}_0$). This inner product is a positive semi-definite hermitian form,
so the set of all $f$ in $\mathfrak{K}_0$ with $(f, f) = 0$ forms a linear subspace $\mathfrak{K}_0'$ of $\mathfrak{K}_0$,
and the quotient $\mathfrak{K}_0/\mathfrak{K}_0' = \mathfrak{K}_1$ (say) has canonically defined on it a strictly
positive definite hermitian form:


$(f', g') = (f, g)$ if $f' = f + \mathfrak{K}_0'$ and $g' = g + \mathfrak{K}_0'$.

Now let $U_0(z')$, for $z'$ in $\mathfrak{H}$, denote the transformation on $\mathfrak{K}_0$:

$$f(z) \to e^{iB(z', z)} f(z - z').$$

Then $U_0(z')$ is a linear operator with the inverse $U_0(-z')$. Also,

$$(U_0(z')f, U_0(z')g) = (f, g)$$

for arbitrary $f$, $g$, and $z'$. It follows that the map

$$U_1(z'): f' \to U_0(z')f + \mathfrak{K}_0'$$


is a well-defined linear transformation on $\mathfrak{K}_1$, that it has the inverse $U_1(-z')$,
and that

$$(U_1(z')f', U_1(z')g') = (f', g').$$

Hence $U_1(z)$ extends uniquely to a unitary transformation $U(z)$ on the
completion $\mathfrak{K}$ of $\mathfrak{K}_1$. Now $U_0(z)U_0(z') = e^{iB(z, z')} U_0(z + z')$, from which it
follows readily that

$$U(z)U(z') = e^{iB(z, z')} U(z + z').$$

In particular, $[U(tz): -\infty < t < \infty]$ is a one-parameter group of unitary
operators in $\mathfrak{K}$. This one-parameter group is continuous; to show this it
suffices, by the density of $\mathfrak{K}_1$ in $\mathfrak{K}$, to show that $(U(tz)f', g')$ is continuous
for arbitrary $f'$ and $g'$ in $\mathfrak{K}_1$. But


$$(U(tz)f', g') = \sum_{u, u'} e^{iB(tz, u)} f(u - tz)\, \overline{g(u')}\, \mu(u - u')\, e^{iB(u, u')}.$$


Now if $f$ has the values $\sigma_1, \ldots, \sigma_n$ at $v_1, \ldots, v_n$ respectively, and $g$ has the
values $\tau_1, \ldots, \tau_n$ at these points, and both functions vanish elsewhere, this
sum is


$$\sum_{j,k} \mu(v_j + tz - v_k)\, e^{iB(tz, v_j + tz)}\, e^{iB(v_j + tz, v_k)}\, \sigma_j \bar{\tau}_k,$$













which represents a continuous function of $t$ by the assumption on $\mu$. Thus
the one-parameter group has a self-adjoint generator $R(z)$, and the $R(z)$
satisfy the relations (x).

If $f_0$ denotes the function defined by the equations $f_0(0) = 1$, $f_0(z) = 0$
for $z \neq 0$, then $(f_0', f_0') = 1$, and


$$(U(z)f_0', f_0') = \mu(z).$$


Thus setting

$$E(A) = (Af_0', f_0'),$$


$E$ is regular and has characteristic functional $\mu$.
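
The construction in the "if" part can be tried out on a small example. The Python sketch below is a finite illustration under stated assumptions: $\mathfrak{H}$ is taken to be $\mathbb{C}$ with $B(z, w) = \tfrac{1}{2}\operatorname{Im}(z\bar{w})$, $\mu$ is taken to be a Gaussian, and finitely supported functions are stored as dictionaries. It checks that the translation operators preserve the inner product, compose with the phase $e^{iB(a, b)}$, and reproduce $\mu$ on the distinguished function $f_0$. All names are hypothetical.

```python
import cmath

def B(z, w):
    # Assumed skew form on C: B(z, w) = (1/2) Im(z * conj(w)).
    return 0.5 * (z * w.conjugate()).imag

def mu(z):
    # Assumed generating functional, chosen only for illustration.
    return cmath.exp(-abs(z) ** 2 / 4)

def inner(f, g):
    # (f, g) = sum over z, z' of f(z) conj(g(z')) mu(z - z') exp(i B(z, z')).
    return sum(fz * gw.conjugate() * mu(z - w) * cmath.exp(1j * B(z, w))
               for z, fz in f.items() for w, gw in g.items())

def U0(a, f):
    # (U0(a) f)(z) = exp(i B(a, z)) f(z - a): the support is shifted by a.
    return {u + a: cmath.exp(1j * B(a, u + a)) * fu for u, fu in f.items()}

# Finitely supported functions stored as {point: value}. Dyadic points keep
# the floating-point additions exact, so dictionary keys match exactly.
f = {0j: 1.0 + 0j, 1 + 0j: 0.5 - 0.25j, 0.5j: -2.0 + 1j}
g = {0.25 + 0.5j: 1j, 1 + 0j: 3.0 + 0j}
f0 = {0j: 1.0 + 0j}
a, b = 0.5 + 0.25j, -1.0 + 0.75j

# U0(a) preserves the inner product.
isometry_gap = abs(inner(U0(a, f), U0(a, g)) - inner(f, g))
# U0(a) U0(b) = exp(i B(a, b)) U0(a + b).
lhs = U0(a, U0(b, f))
rhs = {z: cmath.exp(1j * B(a, b)) * v for z, v in U0(a + b, f).items()}
weyl_gap = max(abs(lhs[z] - rhs[z]) for z in lhs)
# (U0(a) f0, f0) = mu(a).
vacuum_gap = abs(inner(U0(a, f0), f0) - mu(a))
print(isometry_gap, weyl_gap, vacuum_gap)
```

The invariance of the inner product and the composition law hold for any choice of `mu`; positivity of the form is the only place where the hypothesis on $\mu$ enters.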

To show that $\mu$ determines $E$ uniquely, let $E'$ be an arbitrary regular state
with characteristic functional $\mu$. Then $E'$ is weakly continuous relative to the
unit sphere of the ring $\mathfrak{A}_{\mathfrak{M}}$ generated by the $e^{iR(z)}$ for $z$ in $\mathfrak{M}$, for all $\mathfrak{M}$ on
which $B$ is non-degenerate, and some concrete LBF. If $E$ is similarly weakly
continuous, etc., relative to the same concrete LBF, then the unicity follows
from the circumstance that the finite linear combinations of the $e^{iR(z)}$, $z$ in $\mathfrak{M}$,
form an algebra $\mathfrak{A}_{\mathfrak{M}}^{0}$ whose weak closure is $\mathfrak{A}_{\mathfrak{M}}$; this implies that the unit
sphere in $\mathfrak{A}_{\mathfrak{M}}^{0}$ is weakly dense in that of $\mathfrak{A}_{\mathfrak{M}}$, according to a variant of an
argument due to von Neumann (14) (for full details cf. (15)). The assumed
weak continuity of $E$ and $E'$ on $\mathfrak{A}_{\mathfrak{M}}$ relative to the unit sphere, together with
their agreement on the unit sphere of $\mathfrak{A}_{\mathfrak{M}}^{0}$, then implies their equality.

To conclude the proof it therefore suffices to show the


LEMMA. If a state $E$ is weakly continuous on $\mathfrak{A}_{\mathfrak{M}}$ relative to the unit sphere
for one concrete LBF, then the same is true for all concrete LBFs.


To prove this, observe that as $B$ is non-degenerate on $\mathfrak{M}$, co-ordinates may
be chosen so that in $\mathfrak{M}$, $z = (x_1, \ldots, x_n) \oplus (y_1, \ldots, y_n)$ ($\mathfrak{M}$ being of dimension
$2n$), and $B$ has the form $B(z, z') = \sum_k (x_k y_k' - x_k' y_k)$. The relations
(x) then imply those on which von Neumann's proof of the uniqueness of the
Schrödinger operators is based, so that by his result, $\mathfrak{A}_{\mathfrak{M}}$ is, within multiplicity
and unitary equivalence, the conventional system of bounded observables in
quantum mechanics for a particle with a $2n$-dimensional phase space. That
is to say, $\mathfrak{A}_{\mathfrak{M}}$ is unitarily equivalent to an $\mathfrak{n}$-fold copy, for some finite or
infinite cardinal number $\mathfrak{n}$, of the ring of operators generated by the $\exp(isq_j)$
and the $\exp(itp_k)$ ($-\infty < s, t < \infty$; $j, k = 1, 2, \ldots, n$) in their action on
the space $L_2(E_n)$ of all complex-valued square-integrable functions on Euclidean
$n$-space. All that is relevant here is that the weak topology on the unit
sphere of an operator ring is easily seen to be independent of the multiplicity
$\mathfrak{n}$ of its representation. Now for any two concrete LBFs, the resulting $\mathfrak{A}_{\mathfrak{M}}$ are










within unitary equivalence multiples of the same ring of operators, and the 
lemma follows. 

This concludes the proof of the theorem, but it also follows and seems worth 
pointing out explicitly that we have the 


COROLLARY. If M is a linear subspace of $\mathfrak{H}$ on which B is non-degenerate,
then the ring $\mathfrak{A}_M$ of all field observables based on M is a factor of type I, and
the restriction of E to $\mathfrak{A}_M$ has the form

$$E(X) = \operatorname{tr}(X D_M)$$

for some operator $D_M$ of absolutely convergent trace relative to $\mathfrak{A}_M$.

This is an immediate consequence of von Neumann's result used as above, 
together with the known form of the states of the ring of all bounded operators 
on a Hilbert space that are weakly continuous relative to the unit sphere. 
For this last result cf. for example (16, Theorem 14). 


4. Some examples. The generating functional of the zero-interaction linear
boson field over a (complex) Hilbert space may be computed explicitly as
follows. If $C(z)$ denotes the creation operator for a particle with wave function
$z$, as defined for example in (5), and $R(z)$ denotes the closure of
$(C(z) + C(z)^*)/\sqrt{2}$, while E denotes the zero-interaction vacuum state, then the
evaluation of $E[e^{iR(z)}]$ reduces to the case when $\mathfrak{H}$ is one-dimensional (see
Cor. 3.6 of (17)). The representation of $e^{iR(z)}$ in terms of the one-parameter
unitary groups generated by the canonical p's and q's in one dimension leads
to a familiar type of integral involving the normal distribution, and ultimately
to the result

$$\mu(z) = e^{-\frac14\|z\|^2}$$

for the zero-interaction vacuum generating functional.
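
The one-dimensional normal integral mentioned above can be checked numerically. The following is only a sketch under a stated assumption: for a unit vector $z$ the field value $R(z)$ is taken to be normally distributed in the vacuum with mean 0 and variance 1/2, which is the variance implied by the two-point formula given later in this section. The function name and the quadrature parameters are illustrative and do not come from the paper.

```python
import math

def vacuum_characteristic(t, half_width=12.0, steps=20_000):
    """Trapezoid estimate of E[exp(i t R)] for R ~ N(0, 1/2).

    Assumption: the density of R is exp(-y^2) / sqrt(pi) (variance 1/2).
    The imaginary part vanishes by symmetry, so only cos(t y) is integrated.
    """
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        y = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.cos(t * y) * math.exp(-y * y)
    return total * h / math.sqrt(math.pi)

# Compare with exp(-t^2 / 4), i.e. mu(z) for a vector z of norm t.
for t in (0.0, 0.5, 1.0, 2.0):
    print(t, vacuum_characteristic(t), math.exp(-t * t / 4.0))
```

The two printed columns should agree to many decimal places, which is consistent with the stated form of the generating functional on a one-dimensional subspace.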
This may be used to obtain zero-interaction vacuum expectation values of
arbitrary products of field values by noting that formally one has for any
state E,

$$E[R(z_1)\cdots R(z_n)] = i^{-n}\{\partial^n/\partial t_1\cdots\partial t_n\}\,E[e^{it_1R(z_1)}\cdots e^{it_nR(z_n)}]\Big|_{t_1=\cdots=t_n=0},$$

while


tj teB(2;j, 2%) 


z 
eM R(e) ie it et Ble) - go Bi tisit...+ tated 5 —, 


which has zero-interaction vacuum expectation value 
exp[—4 >> tfe(z,, 2e)). 
gk 


Of course, in general the indicated derivative as well as the expectation value 
of the product of field operators will fail to exist. However, it is easy to justify 













in the bare field case the foregoing formal equality, and thereby obtain explicit 
expressions for the bare vacuum expectation values of products of fields. In 
the simplest non-vanishing case, there results the formula 


$$E[R(z_1)R(z_2)] = \tfrac12 (z_1, z_2)$$


(also easily obtainable directly) having the conventional interpretation that 
the zero-interaction vacuum expectation value of the product of two field 
values is (1/2) the singular function providing the kernel for the symmetric 
form giving the Lorentz-invariant inner product, as is well known for special 
fields. 

Turning to general states, if all the regular states could readily be expressed
in terms of zero-interaction field quantities, the use of the general LBF might
be avoidable. Theoretically this would be rather extraordinary, in view of the
general situation in quantum fields, and in fact quite explicit examples can
be constructed to show that this is not the case. To show simply the existence
of regular pure states that are not normalizable in the Fock-Cook representation,
that is, not of the form $E(A) = (Av, v)$ for some normalizable vector $v$, one
may proceed as follows. Take the representation of the bare field in terms of
the space $L_2(\mathfrak{H}_r, n)$ of square-integrable functionals over a real subspace $\mathfrak{H}_r$
of $\mathfrak{H}$ such that $\mathfrak{H} = \mathfrak{H}_r + i\mathfrak{H}_r$, as given in (17). Let $\theta$ denote the
automorphism of the algebra of field observables over $\mathfrak{H}$ taking each canonical
$p$ into $2p$ and each canonical $q$ into $q/2$ (this exists by Theorem 2 of I), and
define $E_\theta$ as the transform of the bare vacuum state under the induced action
of $\theta$, that is, $E_\theta(A) = E(A^\theta)$, where E denotes the bare vacuum state and
$A \to A^\theta$ the action of $\theta$. Then $E_\theta$ is evidently regular and pure, but may be
seen to be non-normalizable as follows.

As a basis for an indirect argument, assume that $E_\theta(A) = (Au, u)$, for
some $u$ in $L_2(\mathfrak{H}_r, n)$. Then for arbitrary $x$ in $\mathfrak{H}_r$, using the notation and Theorem
3 of (17), with $c = \tfrac12$,


Ey(e'?®) -_ f. e*-"\u(y)|*dn(y), 
9, 


and since Es(e*?™) = E(e*?®/?), which is readily evaluated as 
e 1 8iz\? 


it suffices to show 


f. e*F(y)dn(y) # e417! 
Sy) 


Tr 


if $F$ is an arbitrary element of $L_1(\mathfrak{H}_r, n)$. Such an $F$ is an $L_1$-limit of polynomial
functions, so it suffices to show that if $G(\cdot)$ is any polynomial and


g(x) = feesGQ)an(y), 


then 











o = sup,|g(x) — e7?!7!*| > 3, 


where 6 is > 0 and independent of G. But g(x) has the form 
g(x) = p(x)e tl, 
where $p$ is a polynomial, based, say, on the finite-dimensional subspace
$M$ of $\mathfrak{H}_r$. Now if $x$ is in the orthocomplement of $M$, $p(x) = p(0)$, whence


: tie? —1/8\2|) 
o > infe supocrc,, jae ?'*" — e~V8!#!*|_ 


Simple calculus leads from this to the bound ¢ > }. 


5. Special dynamics in terms of generating functionals. For the
special but significant case of a Hamiltonian that is quadratic in the canonical
variables, there is a remarkably simple formulation of the corresponding
quantum dynamics. It gives the explicit time development of the generating
functional, and hence of the system, in terms of the corresponding classical
dynamics.

If a classical motion with Hamiltonian quadratic in the canonical variables takes

$$z \to V_t z \quad (-\infty < t < \infty;\ t = \text{time})$$

then the corresponding quantum-mechanical motion transforms generating
functionals as follows:

$$\mu(z) \to \mu(V_t z).$$

To prove this, note that if the Hamiltonian is quadratic, then the motion
in phase space $\mathfrak{H}$ is linear. That is to say, for any fixed $t$, $V_t$ is a non-singular
transformation preserving the fundamental skew form
$B(z, z') = \sum_k (p_k' q_k - p_k q_k')$, where $z = (p_1,\dots,p_n) \oplus (q_1,\dots,q_n)$, where $n$ may be finite, or
there may be infinitely many degrees of freedom, in which case the appropriate
modifications in the notation are obvious (cf. (18) for the infinitesimal situation
in a finite number of dimensions and (19) for the global situation in any
number of dimensions). Now for any such transformation $T$, the corresponding
quantum-mechanical motion may be uniquely given by the condition that it
transform $R(z)$ into $R(Tz)$ (cf. I), so that it has precisely the stated effect
on the generating functional.
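
As an illustration of the rule $\mu \to \mu \circ V_t$, the sketch below treats one degree of freedom. Everything in it is an assumption made for the example rather than part of the paper: the point $z$ is represented as a pair `(p, q)`, the norm is taken as $\|z\|^2 = p^2 + q^2$, and the two Hamiltonians $(p^2+q^2)/2$ and $pq$ are chosen only because their flows are elementary.

```python
import math

def B(z, w):
    """Skew form B(z, z') = p' q - p q' for one degree of freedom, z = (p, q)."""
    return w[0] * z[1] - z[0] * w[1]

def oscillator_flow(t, z):
    """Classical flow of H = (p^2 + q^2)/2: a rotation of the (p, q) plane."""
    p, q = z
    c, s = math.cos(t), math.sin(t)
    return (c * p - s * q, s * p + c * q)

def squeeze_flow(t, z):
    """Classical flow of H = p q: dq/dt = q, dp/dt = -p."""
    p, q = z
    return (math.exp(-t) * p, math.exp(t) * q)

def mu_vacuum(z):
    """exp(-|z|^2 / 4), with the assumed norm |z|^2 = p^2 + q^2."""
    return math.exp(-(z[0] ** 2 + z[1] ** 2) / 4.0)

def evolve(mu, flow, t):
    """Quantum motion on generating functionals: mu -> mu composed with V_t."""
    return lambda z: mu(flow(t, z))

z, w, t = (0.3, -1.2), (2.0, 0.7), 0.9
print(B(z, w), B(oscillator_flow(t, z), oscillator_flow(t, w)))   # skew form preserved
print(mu_vacuum(z), evolve(mu_vacuum, oscillator_flow, t)(z))     # unchanged by rotation
print(mu_vacuum(z), evolve(mu_vacuum, squeeze_flow, t)(z))        # changed by squeezing
```

Both flows preserve the skew form, but only the rotation leaves this Gaussian functional fixed; the squeezing flow carries it into a different Gaussian, in line with the non-normalizable example of Section 4.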


6. Restriction to regular observables instead of regular states. 
Instead of dealing with a restricted class of states of an extensive system of 
observables, one may contemplate dealing with all states of a restricted sub- 
system of observables. A general bounded self-adjoint operator is an observable 
only in a quite theoretical way; a class of operators slightly closer to actual 
measurements than those merely generated by the canonical variables would 
be those expressible explicitly in terms of them. It turns out that for systems 
of a finite number of degrees of freedom, which may be used to approximate 
infinite systems, there is a natural way to make this idea effective. 










Definition 2. A regular observable over a finite-dimensional linear space $\mathfrak{H}$
with distinguished skew-symmetric form B, relative to a concrete LBF over
($\mathfrak{H}$, B), is one of the form

$$\int_{\mathfrak{H}} e^{iR(z)} f(z)\,dz$$

or uniformly approximable by such. Here $f$ is integrable over $\mathfrak{H}$, while $dz$ is
the element of measure determined by B.

The foregoing integral may be taken in the strong or weak operator topologies
(in the sense of (20)). The resulting class of operators is unaffected by
a change in the measure employed, within absolute continuity. It is not
difficult to verify that the notion of regular observable is invariant under
physical equivalence, and therefore may be applied to elements of the general
LBF over a finite-dimensional space.


THEOREM 2. The regular observables form a uniformly closed self-adjoint
algebra, every state of which extends uniquely to a regular state of the general LBF
over ($\mathfrak{H}$, B), $\mathfrak{H}$ being finite-dimensional; and every regular state arises in this way.


That the regular observables form an algebra follows without difficulty
from the relations (*), the general properties of the integrals involved, and
the Fubini theorem. Now for any state E,

$$\Big| E\Big[\int_{\mathfrak{H}} e^{iR(z)} f(z)\,dz\Big] \Big| \le \int_{\mathfrak{H}} |f(z)|\,dz,$$

so that by the known form of the general continuous linear functional in $L_1$,
there exists a bounded measurable function $\mu$ on $\mathfrak{H}$ such that

$$E\Big[\int_{\mathfrak{H}} e^{iR(z)} f(z)\,dz\Big] = \int_{\mathfrak{H}} \mu(z) f(z)\,dz.$$


By the positivity of E, u satisfies the inequality 
fue — 2')e**-*f (2) (2')dz dz’ > 0. 


If $\mu$ were continuous, this would imply that $\mu$ is a generating functional. Now
it is well-known that a measurable integrally-positive-definite function on a
locally compact group differs from a continuous positive definite function
on the group on a null set (in the sense of Haar measure on the group). An
argument similar to that involved in the proof of this result yields the
corresponding result here, or this result may be derived from the theory of positive
definite functions on groups; here we take the latter course, as this is
illuminating in certain additional respects.

Let G be the group (cf. (18)) of all pairs $(z, s)$ with $z$ in $\mathfrak{H}$ and $s$ real, and
the multiplication

$$(z, s)\,(z', s') = (z + z',\ s + s' + B(z, z')).$$










Then G is a Lie group with the obvious manifold structure. Set A(z, s) 
= yu(z)e~"*; then A is a measurable and integrally positive definite function 
on G, as is readily verified. By the result cited, it differs on a null set from a
continuous positive definite function. Now $\mathfrak{H}$ may be identified with the subset
of G consisting of all elements of the form $(z, 0)$, the map $z \to (z, 0)$ being a
homeomorphism into. It follows that $\mu$ differs on a null set in $\mathfrak{H}$ from a
continuous function $\mu'$.

Now y’ will satisfy the positivity condition given in Theorem 1, as follows 
from the integral positivity condition by a simple approximation argument. 
Since (u’ (2, — z,)e*®-; j,k = 1,...,) is a positive semi-definite hermitian 
matrix for arbitrary $z_1,\dots,z_n$, and $\mu'$ cannot vanish identically since $E \neq 0$, $\mu'(0)$
must be positive. Setting $\mu'' = \mu'/\mu'(0)$, $\mu''$ satisfies all the conditions of Theorem
1, so there exists a regular state E' of the general LBF, say $\mathfrak{A}$, over ($\mathfrak{H}$, B), whose
generating functional is $\mu''$. Now the unit sphere of the algebra $\mathfrak{R}$ of regular
observables is weakly dense in that of $\mathfrak{A}$, by the result cited above, together
with the easily established fact that the weak closure of $\mathfrak{R}$ is $\mathfrak{A}$. In particular,

$$\sup_{X \in \mathfrak{R},\ \|X\| \le 1} E'(X^*X) = \sup_{X \in \mathfrak{A},\ \|X\| \le 1} E'(X^*X),$$

which shows, since E' is a state of $\mathfrak{A}$, that the right side of the foregoing equality
has the value unity. On the other hand,

$$E'\Big[\int_{\mathfrak{H}} e^{iR(z)} f(z)\,dz\Big] = (\mu'(0))^{-1} \int_{\mathfrak{H}} \mu'(z) f(z)\,dz$$

for arbitrary continuous $f$ vanishing outside a compact set, as the integrals
may then be taken in the Riemann sense; and by a simple approximation
argument, it results that $E'(X) = (\mu'(0))^{-1} E(X)$ for arbitrary $X$ in $\mathfrak{R}$. Now
as E is a state of $\mathfrak{R}$,

$$\sup_{X \in \mathfrak{R},\ \|X\| \le 1} E(X^*X) = 1,$$

and it follows that $\mu'(0) = 1$.

Thus there exists a regular state E' of the full general LBF extending the
given state E of the regular observables. That E' is unique follows from the
density of the unit sphere of $\mathfrak{R}$ in that of $\mathfrak{A}$. Conversely, if E' is a given regular
state of $\mathfrak{A}$, its restriction to $\mathfrak{R}$ is a positive linear functional which by the
argument just given must have unit norm, and so be a state E.

It should be remarked that the connection with the theory of positive definite 
functions can be misleading, if not utilized with care. For example, when 
B = 0, the condition of Theorem 1 becomes ordinary positive definiteness 
(apart from the normalization), but the theorem is then irremediably false, 
as it asserts essentially that an arbitrary positive definite function is the 
Fourier-Stieltjes transform of an absolutely continuous measure. The introduc- 
tion of a non-degenerate B thus has roughly the qualitative effect of eliminating 
the possible discontinuous and continuous but singular parts of the associated 
state. 
















7. Possible further developments. A number of problems emerge from 
the foregoing work. Some typical ones of interest are as follows. 

1. The zero-interaction vacuum is the only regular state of the general
LBF over a complex Hilbert space $\mathfrak{H}$ that is invariant under all unitary
operators on $\mathfrak{H}$. Presumably it is, more cogently, the only such state invariant
under a physically relevant representation of the Lorentz group, say, that
associated with a relativistic particle of integral spin. For a particle of positive
mass (= minimum proper value of the infinitesimal generator of translations
in time), it is clear that there exist no normalizable states in the Fock-Cook
representation other than the zero-interaction vacuum that are invariant,
but it remains to be proved that this is the case for all regular states. In the
vanishing mass case the situation is less clear, and correspondingly more
interesting.

2. It seems probable that any symplectic transformation in a complex
Hilbert space $\mathfrak{H}$ (that is, a real-linear transformation leaving invariant the
imaginary part of the inner product) will effect a transformation of the
zero-interaction vacuum of the general LBF over $\mathfrak{H}$ into a state that is not
normalizable in the Fock-Cook representation, except when the transformation is
unitary; this would generalize the example given above of such a state.*

3. When $\mathfrak{H}$ is finite-dimensional, $\mu(z)$ is essentially the Fourier transform
of Wigner's quasi-probability distribution ((21); cf. also (22) and the literature
cited there, especially the paper by Moyal). Now a classical motion takes a
$z$ into a $z'$, while a quantum-mechanical one takes a generating function $\mu$ into
another $\mu'$. The determination of the precise relation between $\mu'(z)$ and
$\mu(z')$, shown above to be identical in the case of a quadratic Hamiltonian,
is connected with the problems of interpretation considered by Wigner and
later authors. Although it seems fairly clear that no exact result is to be
hoped for, even a simple approximate relation between the two functions
might well be quite useful.

4. The difficulty forming the basis of the preceding problem also suggests
the more extensive question of the extent to which a theory similar to the
present one, but covariant under the entire group of classical contact
transformations rather than merely the symplectic group, can be set up. In such
a theory the analogue to the smoothed field operators $R(z)$ would perhaps
be a function $R(Z)$ defined for infinitesimal contact transformations $Z$, and
satisfying in place of the commutation relations involved above, the relations

$$[R(Z), R(Z')] = R([Z, Z']) + \Omega(Z, Z'),$$

where $\Omega$ denotes the fundamental second-order differential form on $\mathfrak{H}$ (that is,
the well-known form $\sum_k dp_k\,dq_k$ in the case of a finite number of degrees of
freedom, while in the infinite case it is an analogous form determined by the
field commutators). The main difficulty here is not the presence of the term


*Remark added in proof. This result has been established in the meantime by David Shale. 







$R([Z, Z'])$, which was absent above because $Z$ and $Z'$ were essentially
infinitesimal translations in $\mathfrak{H}$ and so had vanishing commutator, but rather the
circumstance that the $\Omega(Z, Z')$ are not constant numbers, but scalar functions
on $\mathfrak{H}$, the multiplications by which naturally do not commute with the $Z$'s.

A possible way around this difficulty is the employment of a suitable
analogue to the group G employed above, such as the group whose Lie algebra
(that is, associated infinitesimal group) consists of all pairs $(Z, f)$, where $Z$
is an infinitesimal classical contact transformation and $f$ is a function on phase
space, with the commutation relations

$$[(Z, f), (Z', f')] = ([Z, Z'],\ Zf' - Z'f + \Omega(Z, Z')).$$

That these define a Lie algebra (that is, notably that the Jacobi conditions
hold) follows from the fact that $\Omega$ is a closed form, using the expression for the
derivative of a form in terms of the form itself, together with brackets of
vector fields and the operations of the vector fields on values of the form
(cf. (23), § 1). A pair such as $(Z, f)$ may be interpreted as the generator of a
contact transformation in the tangent bundle of the phase space $\mathfrak{H}$, a
construction that has been suggested in another form and connection in (24) for
the finite-dimensional case, and which leads to difficulties of interpretation
as pointed out there. On the other hand, the linearity of $\mathfrak{H}$ has ceased to play
a role; the same construction can be made for any manifold $\mathfrak{H}$ (endowed with
a suitable form $\Omega$, determined in physics from the equations defining $\mathfrak{H}$). The
approach therefore opens up a possible way of quantizing non-linear systems
covariantly with respect to the group of all classical contact transformations.


REFERENCES 


1. W. Heisenberg and W. Pauli, Quantum mechanics of wave fields, Zeits. f. Physik, 56 (1929),
1-61.

2. N. Bohr and L. Rosenfeld, Zur Frage der Messbarkeit der elektromagnetischen Feldgrössen,
Kgl. Danske Vidensk. Selsk., mat.-fys. Medd. XII, 8 (1933), 3-65.

3. V. Fock, Konfigurationsraum und zweite Quantelung, Zeits. f. Physik, 75 (1932), 622-647.

4. K. O. Friedrichs, Mathematical aspects of the quantum theory of fields, I-II, Commun. Pure
Appl. Math., 4 (1951), 161-224.

5. J. M. Cook, The mathematics of second quantization, Thesis, University of Chicago, 1951;
in part in Trans. Amer. Math. Soc., 74 (1953), 222-245.

6. A. S. Wightman, Quelques problèmes mathématiques de la théorie quantique relativiste, in
Colloque (Lille, 1957) on the mathematical problems of quantum field theory (Paris, 1959),
1-38.

7. G. Källén and A. S. Wightman, The analytic properties of the vacuum expectation values of a
product of three scalar fields, K. Danske Vidensk. Selsk., mat.-fys. Skr., 1 (1958).

8. J. A. Shohat and J. D. Tamarkin, The problem of moments (New York, 1943).

9A. G. Källén, Lectures at the Institute for Theoretical Physics, Lund, Spring, 1959.

9B. J. Schwinger, Quantum electrodynamics. I, A covariant formalism, Phys. Rev., 74 (1948),
1439-1461.

10. I. E. Segal, Distributions in Hilbert space and canonical systems of operators, Trans. Amer.
Math. Soc., 88 (1958), 12-41.

11. I. E. Segal, Direct formulation of causality requirements on the S-operator, Phys. Rev., 109
(1958), 2191-2198.

12. I. E. Segal, Foundations of the theory of dynamical systems of infinitely many degrees of
freedom, I, Kgl. Danske Vidensk. Selsk., mat.-fys. Medd., 31 (1959), 1-38.

13. I. E. Segal, Postulates for general quantum mechanics, Ann. Math., 48 (1947), 930-948.

14. F. J. Murray and J. von Neumann, On rings of operators, IV, Ann. Math., 44 (1943),
716-808.

15. I. Kaplansky, A theorem on rings of operators, Pac. J. Math., 1 (1951), 227-232.

16. I. E. Segal, A non-commutative extension of abstract integration, Ann. Math., 57 (1953),
401-457.

17. I. E. Segal, Tensor algebras over Hilbert spaces, I, Trans. Amer. Math. Soc., 81 (1956), 106-134.

18. L. van Hove, Sur certaines représentations unitaires d'un groupe infini de transformations,
Acad. Roy. Belgique, Cl. Sci. Mém. Coll. in 8°, 26 (1951), 102 pp.

19. D. Shale, On certain groups of operators on Hilbert space, Doctoral Thesis, University of
Chicago, 1959.

20. R. S. Phillips, Integration in a convex linear topological space, Trans. Amer. Math. Soc.,
47 (1940), 114-145.

21. E. Wigner, On the quantum correction for thermodynamic equilibrium, Phys. Rev., 40 (1932),
749-759.

22. G. A. Baker, Jr., Formulation of quantum mechanics based on the quasi-probability
distribution induced in phase space, Phys. Rev., 109 (1958), 2198-2206.

23. G. Hochschild, Cohomology of restricted Lie algebras, Amer. J. Math., 76 (1954), 555-580.

24. L. H. Thomas, General relativity and particle dynamics, Phys. Rev., 112 (1958), 2129-2134,
and to appear.











RECIPROCAL CONVERGENCE CLASSES FOR 
FOURIER SERIES AND INTEGRALS 


A. P. GUINAND


Introduction. The classical result of Plancherel for Fourier cosine
transforms of functions $f(x)$ of the class $L^2(0, \infty)$ states that (see (7) for references)

$$g(x) = \operatorname*{l.i.m.}_{T \to \infty} \Big(\frac{2}{\pi}\Big)^{1/2} \int_0^T f(t) \cos xt\,dt$$

converges in mean square to a function $g(x)$ which also belongs to $L^2(0, \infty)$,
and furthermore

$$f(x) = \operatorname*{l.i.m.}_{T \to \infty} \Big(\frac{2}{\pi}\Big)^{1/2} \int_0^T g(t) \cos xt\,dt.$$

Some years ago in a series of papers (1; 2; 3) on summation formulae I
showed that a similar symmetrical theory for narrower classes of functions
and ordinary convergence of the integrals can also be developed. The relevant
results can be expressed as follows:

THEOREM 1. If $f(x)$ is the integral of its derivative and $x f'(x)$ belongs to
$L^2(0, \infty)$, then

$$\lim_{x \to \infty} f(x) = l$$

exists, $f(x) - l$ belongs to $L^2(0, \infty)$, and

$$f(x) - l = o(x^{-1/2})$$

as $x$ tends to $+0$ or to $+\infty$.

Definition 1. If $f(x)$ is the integral of its derivative and $x f'(x)$ belongs to
$L^2(0, \infty)$, and if the limit to which $f(x)$ tends as $x$ tends to infinity is zero,
then we say that $f(x)$ belongs to the class $S_1^2(0, \infty)$.

THEOREM 2. If $f(x)$ belongs to $S_1^2(0, \infty)$ then for $x > 0$

$$g(x) = \Big(\frac{2}{\pi}\Big)^{1/2} \int_0^{\to\infty} f(t) \cos xt\,dt \tag{1}$$

converges, $g(x)$ also belongs to $S_1^2(0, \infty)$, and

$$f(x) = \Big(\frac{2}{\pi}\Big)^{1/2} \int_0^{\to\infty} g(t) \cos xt\,dt. \tag{2}$$

Here we use the notation

$$\int_a^{\to\infty} = \lim_{T \to \infty} \int_a^T.$$

Received September 20, 1959. 


That is to say that the class $S_1^2(0, \infty)$ is a subclass of $L^2(0, \infty)$, and that
it can be described as a self-reciprocal convergence class for Fourier cosine
transformations.

This theory has recently been extended by Miller (5) to cover wider
subclasses of $L^2(0, \infty)$ and more general transformations. A disadvantage of
Theorem 2 is that, although the result is simple and easily applied, the proof
is indirect and uses results from the Plancherel theory.

In the first part of the present paper I show how to find narrower
self-reciprocal convergence classes for Fourier cosine transforms, and I give a
direct proof of the Fourier inversion formula without using the Plancherel
theory for one such self-reciprocal convergence class.

In the second part of the paper I prove analogues of Theorems 1 and 2 for
Fourier series. I define a class $S_1^2(0, 2\pi)$ of functions $f(x)$ of period $2\pi$ and a
class $\Sigma_1^2(-\infty, \infty)$ of sequences $\{c_n\}$ ($n = 0, \pm 1, \pm 2, \dots$) which are
reciprocal convergence classes for Fourier series in the sense that:

(i) if $f(x)$ belongs to $S_1^2(0, 2\pi)$ then it has a Fourier series

$$f(x) = \sum_{n=-\infty}^{\infty} c_n e^{inx} \tag{3}$$

which converges for $x \not\equiv 0 \pmod{2\pi}$, and $\{c_n\}$ belongs to $\Sigma_1^2(-\infty, \infty)$;
(ii) if $\{c_n\}$ belongs to $\Sigma_1^2(-\infty, \infty)$ then $\sum_{n=-\infty}^{\infty} c_n e^{inx}$ converges for all
$x \not\equiv 0 \pmod{2\pi}$ and defines a function $f(x)$ belonging to $S_1^2(0, 2\pi)$.


PART I: FOURIER INTEGRALS


1. Reciprocal classes and Mellin transforms. If $f(x)$ and $g(x)$ are
Fourier cosine transforms connected by the equations (1) and (2), and $\mathfrak{F}(s)$
and $\mathfrak{G}(s)$ are their Mellin transforms then, formally (7, p. 213),

$$\mathfrak{G}(s) = \mathfrak{K}(s)\,\mathfrak{F}(1 - s) \tag{4}$$

and

$$\mathfrak{F}(s) = \mathfrak{K}(s)\,\mathfrak{G}(1 - s),$$

where

$$\mathfrak{K}(s) = \Big(\frac{2}{\pi}\Big)^{1/2} \Gamma(s) \cos \tfrac12 s\pi,$$

and consequently

$$|\mathfrak{K}(\tfrac12 + it)| = 1 \tag{5}$$

for all real $t$.

From the $L^2$ theory of Mellin transforms (7, p. 94) it follows that if $f(x)$
belongs to $L^2(0, \infty)$ then $\mathfrak{F}(s)$ belongs to $\mathfrak{L}^2(-\infty, \infty)$. Hence by (4) and (5)
it follows that $\mathfrak{G}(s)$ also belongs to $\mathfrak{L}^2(-\infty, \infty)$ and consequently $g(x)$
belongs to $L^2(0, \infty)$, as required by the Plancherel theory.

A similar argument can be used to show that the class $S_1^2(0, \infty)$ is
self-reciprocal for Fourier cosine transformations. If $f(x)$ belongs to $S_1^2(0, \infty)$ then
the Mellin transform of $x f'(x)$ exists and is

$$\operatorname*{l.i.m.}_{X \to \infty} \int_{1/X}^{X} x f'(x)\, x^{s-1}\,dx
= \operatorname*{l.i.m.}_{X \to \infty} \Big\{ \big[ f(x)\, x^{s} \big]_{1/X}^{X} - s \int_{1/X}^{X} f(x)\, x^{s-1}\,dx \Big\}
= -s\,\mathfrak{F}(s), \tag{6}$$

since the integrated terms vanish for $\Re(s) = \tfrac12$ by Theorem 1. Hence $s\,\mathfrak{F}(s)$
belongs to $\mathfrak{L}^2(-\infty, \infty)$ and it follows from (4) and (5) that $s\,\mathfrak{G}(s)$ also
belongs to $\mathfrak{L}^2(-\infty, \infty)$. Then, reversing the above argument, it follows that
$g(x)$ belongs to $S_1^2(0, \infty)$, as required by Theorem 2.
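
Relation (5) can be checked numerically. The sketch below is an illustration only: it evaluates the kernel $(2/\pi)^{1/2}\Gamma(s)\cos\tfrac12 s\pi$ on the line $\Re(s) = \tfrac12$, using a standard Lanczos approximation for the complex gamma function because Python's `math.gamma` accepts only real arguments. The function names `gamma` and `K` are choices made for this example.

```python
import cmath
import math

# Lanczos coefficients for g = 7 (a widely published set).
_G = 7
_P = [
    0.99999999999980993, 676.5203681218851, -1259.1392167224028,
    771.32342877765313, -176.61502916214059, 12.507343278686905,
    -0.13857109526572012, 9.9843695780195716e-6, 1.5056327351493116e-7,
]

def gamma(z):
    """Complex gamma function via the Lanczos approximation."""
    z = complex(z)
    if z.real < 0.5:
        # Reflection formula Gamma(z) Gamma(1 - z) = pi / sin(pi z).
        return math.pi / (cmath.sin(math.pi * z) * gamma(1 - z))
    z -= 1
    acc = _P[0]
    for i in range(1, len(_P)):
        acc += _P[i] / (z + i)
    t = z + _G + 0.5
    return math.sqrt(2 * math.pi) * t ** (z + 0.5) * cmath.exp(-t) * acc

def K(s):
    """Kernel (2/pi)^(1/2) * Gamma(s) * cos(pi s / 2) appearing in (4)."""
    return math.sqrt(2 / math.pi) * gamma(s) * cmath.cos(math.pi * s / 2)

for t in (-3.0, -1.0, 0.0, 0.5, 2.0):
    print(t, abs(K(0.5 + 1j * t)))   # each modulus should be close to 1
```

The same functions also confirm the functional equation $\mathfrak{K}(s)\mathfrak{K}(1-s) = 1$ at sample points, which is the identity from which (5) follows on the critical line.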

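Now the same procedure can be used when $\mathfrak{F}(s)$, instead of being multiplied
by $-s$ as in (6), is multiplied by some other suitable function of $s$. For example,
put

$$\Phi(1 - s) = \mathfrak{F}(s)/\Gamma(s)$$

and assume that $\Phi(s)$ belongs to $\mathfrak{L}^2(-\infty, \infty)$. Then $\Phi(s)$ is the Mellin
transform of a function $\phi(x)$ belonging to $L^2(0, \infty)$. Further $\Gamma(s)$ is the Mellin
transform of $e^{-x}$ and consequently the relationship

$$\mathfrak{F}(s) = \Gamma(s)\,\Phi(1 - s) \tag{7}$$

corresponds to (7, p. 213)

$$f(x) = \int_0^\infty e^{-xt} \phi(t)\,dt. \tag{8}$$

From (4) and (7) we have

$$\mathfrak{G}(s) = \mathfrak{K}(s)\,\Gamma(1 - s)\,\Phi(s)
= \Gamma(s) \Big\{ \mathfrak{K}(s)\, \frac{\Gamma(1 - s)}{\Gamma(s)}\, \Phi(s) \Big\}.$$

Hence, by (5), on $\Re(s) = \tfrac12$

$$\Big| \frac{\mathfrak{G}(s)}{\Gamma(s)} \Big| = |\Phi(s)|$$

and so $\mathfrak{G}(s)/\Gamma(s)$ belongs to $\mathfrak{L}^2(-\infty, \infty)$.
Reversing the argument from (7) to (8) it follows that there is a $\psi(x)$
belonging to $L^2(0, \infty)$ for which

$$g(x) = \int_0^\infty e^{-xt} \psi(t)\,dt,$$

and we have the following result.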



















THEOREM 3. If $f(x)$ is the Laplace transform of a function of $L^2(0, \infty)$, and
$g(x)$ is its Fourier cosine transform, then $g(x)$ is also the Laplace transform of a
function of $L^2(0, \infty)$.

Let us now make the following definition.

Definition 2. The function $f(x)$ is said to belong to the class $A^2\{h(x)\}$ if there
exists a function $\phi(x)$ belonging to $L^2(0, \infty)$ such that

$$f(x) = \int_0^\infty h(xt)\,\phi(t)\,dt$$

for all $x > 0$.

Then Theorem 3 states that the class $A^2(e^{-x})$ is self-reciprocal for Fourier
cosine transformations.


The same type of argument can be used to prove the following more general 
result. 


THEOREM 4. If $h(x)$ belongs to $L^2(0, \infty)$ and has a Mellin transform $\mathfrak{h}(s)$
satisfying


i + it) | 


eG _ it) | 


for all real $t$ then $A^2\{h(x)\}$ is a self-reciprocal class of functions with respect
to any general transformation of the Fourier type.


For general transformations see (7, ch. VIII). 


2. Symmetrical convergence theorems by direct methods. The
arguments of § 1 do not prove that the Fourier integrals (1) and (2) converge,
and they use the $L^2$ theory of Mellin transforms. If we consider the class of
functions

$$A^2(e^{-\frac12 x^2})$$

we can derive a symmetrical convergence theorem for the Fourier cosine
transformation by a direct method. The result is:

THEOREM 5. If $f(x)$ belongs to the class $A^2(e^{-\frac12 x^2})$ then

$$g(x) = \Big(\frac{2}{\pi}\Big)^{1/2} \int_0^{\to\infty} f(t) \cos xt\,dt \tag{9}$$

converges for $x > 0$, $g(x)$ also belongs to $A^2(e^{-\frac12 x^2})$, and

$$f(x) = \Big(\frac{2}{\pi}\Big)^{1/2} \int_0^{\to\infty} g(t) \cos xt\,dt \tag{10}$$

for $x > 0$. Further, if, for $x > 0$,

$$f(x) = \int_0^\infty e^{-\frac12 x^2 t^2} \phi(t)\,dt \tag{11}$$

and

$$g(x) = \int_0^\infty e^{-\frac12 x^2 t^2} \psi(t)\,dt$$

in accordance with Definition 2 then

$$\psi(x) = \frac{1}{x}\,\phi\Big(\frac{1}{x}\Big) \tag{12}$$

almost everywhere.


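From Definition 2 there exists a function $\phi(x)$ of $L^2(0, \infty)$ satisfying (11).
Hence

$$g(x) = \Big(\frac{2}{\pi}\Big)^{1/2} \int_0^{\to\infty} f(t) \cos xt\,dt
= \Big(\frac{2}{\pi}\Big)^{1/2} \int_0^{\to\infty} \cos xt\,dt \int_0^\infty \phi(u)\, e^{-\frac12 t^2 u^2}\,du
= \Big(\frac{2}{\pi}\Big)^{1/2} \int_0^\infty \phi(u)\,du \int_0^\infty e^{-\frac12 t^2 u^2} \cos xt\,dt, \tag{13}$$

provided that this formal process can be justified. Now

$$\Big(\frac{2}{\pi}\Big)^{1/2} \int_0^\infty e^{-\frac12 t^2 u^2} \cos xt\,dt = \frac{1}{u}\, e^{-\frac12 x^2/u^2},$$

so (13) becomes

$$g(x) = \int_0^\infty \phi(u)\, e^{-\frac12 x^2/u^2}\, \frac{du}{u}
= \int_0^\infty \frac{1}{t}\,\phi\Big(\frac{1}{t}\Big)\, e^{-\frac12 x^2 t^2}\,dt
= \int_0^\infty \psi(t)\, e^{-\frac12 x^2 t^2}\,dt \tag{14}$$

by (12).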
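
The closed form for the inner integral used in passing from (13) to (14) can be verified numerically. The following sketch compares a trapezoid estimate of the cosine transform of the Gaussian kernel with the stated value $u^{-1}e^{-\frac12 x^2/u^2}$; the function names, the truncation point and the step count are choices made for this illustration only.

```python
import math

def cosine_transform_of_gaussian(x, u, steps=40_000):
    """Trapezoid estimate of sqrt(2/pi) * int_0^inf exp(-t^2 u^2 / 2) cos(x t) dt."""
    upper = 12.0 / u          # beyond this point the integrand is below exp(-72)
    h = upper / steps
    total = 0.0
    for k in range(steps + 1):
        t = k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(-0.5 * t * t * u * u) * math.cos(x * t)
    return math.sqrt(2.0 / math.pi) * total * h

def closed_form(x, u):
    """The value (1/u) exp(-x^2 / (2 u^2)) stated in the text."""
    return math.exp(-0.5 * x * x / (u * u)) / u

for x, u in ((0.5, 1.0), (1.0, 0.5), (2.0, 2.0), (3.0, 1.5)):
    print(x, u, cosine_transform_of_gaussian(x, u), closed_form(x, u))
```

Substituting $u = 1/t$ in the resulting integral over $u$ is what produces the factor $t^{-1}\phi(1/t)$ in (14), and hence relation (12).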
We can justify this process by the following three lemmas. 


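LEMMA 1. If $V_1 > V > 0$, $y > 0$, then

$$\Big| \int_V^{V_1} e^{-\frac12 v^2} \cos yv\,dv \Big| \le \frac{2}{y}\, e^{-\frac12 V^2}.$$

Proof.

$$\int_V^{V_1} e^{-\frac12 v^2} \cos yv\,dv
= \Big[ \frac{1}{y}\, e^{-\frac12 v^2} \sin yv \Big]_V^{V_1} + \frac{1}{y} \int_V^{V_1} v\, e^{-\frac12 v^2} \sin yv\,dv.$$

Hence

$$\Big| \int_V^{V_1} e^{-\frac12 v^2} \cos yv\,dv \Big|
\le \frac{1}{y} \big( e^{-\frac12 V^2} + e^{-\frac12 V_1^2} \big) + \frac{1}{y} \int_V^{V_1} v\, e^{-\frac12 v^2}\,dv
= \frac{1}{y} \big( e^{-\frac12 V^2} + e^{-\frac12 V_1^2} \big) + \frac{1}{y} \big( e^{-\frac12 V^2} - e^{-\frac12 V_1^2} \big)
= \frac{2}{y}\, e^{-\frac12 V^2}.$$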
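
The bound of Lemma 1 is easy to test on sample values. The sketch below evaluates the integral by the trapezoid rule and compares its modulus with $(2/y)e^{-\frac12 V^2}$; the helper names and the sample points are illustrative assumptions, not taken from the paper.

```python
import math

def tail_integral(V, V1, y, steps=20_000):
    """Trapezoid estimate of int_V^{V1} exp(-v^2 / 2) cos(y v) dv."""
    h = (V1 - V) / steps
    total = 0.0
    for k in range(steps + 1):
        v = V + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(-0.5 * v * v) * math.cos(y * v)
    return total * h

def lemma1_bound(V, y):
    """Right-hand side (2 / y) exp(-V^2 / 2) of Lemma 1."""
    return 2.0 / y * math.exp(-0.5 * V * V)

for V, V1, y in ((0.0, 10.0, 1.0), (1.0, 2.0, 5.0), (0.5, 8.0, 20.0), (2.0, 3.0, 0.3)):
    print(V, V1, y, abs(tail_integral(V, V1, y)), lemma1_bound(V, y))
```

In each case the first printed value should not exceed the second. The factor $1/y$ in the bound is what later supplies the decay in $T$ needed in Lemma 2.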


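LEMMA 2. If $T$, $x$, $\delta$ are positive real numbers and $\phi(x)$ belongs to $L^2(0, \infty)$
then

$$\Big| \int_T^{\to\infty} \cos xt\,dt \int_0^\delta e^{-\frac12 t^2 u^2} \phi(u)\,du \Big|
\le \frac{2^{1/2} \pi^{1/4}}{x\,T^{1/2}} \Big\{ \int_0^\delta |\phi(u)|^2\,du \Big\}^{1/2}. \tag{15}$$

Proof. Consider

$$\int_T^{T_1} \cos xt\,dt \int_0^\delta e^{-\frac12 t^2 u^2} \phi(u)\,du \tag{16}$$

where $T_1 > T > 0$. Since $\phi(x)$ belongs to $L^2(0, \infty)$ it follows that $\phi(x)$
belongs to $L(0, \delta)$ for any finite $\delta$, and that (16) converges absolutely. Hence
(16) is equal to

$$\int_0^\delta \phi(u)\,du \int_T^{T_1} e^{-\frac12 t^2 u^2} \cos xt\,dt
= \int_0^\delta \phi(u)\, \frac{du}{u} \int_{Tu}^{T_1 u} e^{-\frac12 v^2} \cos \frac{xv}{u}\,dv, \tag{17}$$

which in absolute value is

$$\le \frac{2}{x} \int_0^\delta |\phi(u)|\, e^{-\frac12 T^2 u^2}\,du$$

by Lemma 1. By Schwarz's inequality (17) is in absolute value less than or equal to

$$\frac{2}{x} \Big\{ \int_0^\delta |\phi(u)|^2\,du \Big\}^{1/2} \Big\{ \int_0^\delta e^{-T^2 u^2}\,du \Big\}^{1/2}
\le \frac{2}{x} \Big\{ \int_0^\delta |\phi(u)|^2\,du \Big\}^{1/2} \Big\{ \int_0^\infty e^{-T^2 u^2}\,du \Big\}^{1/2}
= \frac{2^{1/2} \pi^{1/4}}{x\,T^{1/2}} \Big\{ \int_0^\delta |\phi(u)|^2\,du \Big\}^{1/2}. \tag{18}$$

Making $T$ and $T_1$ tend to infinity it follows that the double integral in
(15) converges, and (15) follows from (18) if we keep $T$ fixed and make $T_1$
tend to infinity.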













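LEMMA 3. If $x$ is real and positive, and $\phi(x)$ belongs to $L^2(0, \infty)$ then

$$\int_0^{\to\infty} \cos xt\,dt \int_0^\infty \phi(u)\, e^{-\frac12 t^2 u^2}\,du \tag{19}$$

converges and is equal to

$$\int_0^\infty \phi(u)\,du \int_0^\infty e^{-\frac12 t^2 u^2} \cos xt\,dt. \tag{20}$$

Proof. The inversion of order of integration

$$\int_\delta^\infty \phi(u)\,du \int_0^\infty e^{-\frac12 t^2 u^2} \cos xt\,dt
= \int_0^\infty \cos xt\,dt \int_\delta^\infty \phi(u)\, e^{-\frac12 t^2 u^2}\,du$$

is justified by absolute convergence since

$$\int_\delta^\infty |\phi(u)|\,du \int_0^\infty |e^{-\frac12 t^2 u^2} \cos xt|\,dt
\le \int_\delta^\infty |\phi(u)|\,du \int_0^\infty e^{-\frac12 t^2 u^2}\,dt
= (\tfrac12 \pi)^{1/2} \int_\delta^\infty \frac{|\phi(u)|}{u}\,du
\le (\tfrac12 \pi)^{1/2} \Big\{ \int_\delta^\infty |\phi(u)|^2\,du \Big\}^{1/2} \Big\{ \int_\delta^\infty \frac{du}{u^2} \Big\}^{1/2}
= (\tfrac12 \pi)^{1/2}\, \delta^{-1/2} \Big\{ \int_\delta^\infty |\phi(u)|^2\,du \Big\}^{1/2}.$$

Further

$$\Big| \int_0^\delta \phi(u)\, e^{-\frac12 x^2/u^2}\, \frac{du}{u} \Big|
\le \Big\{ \int_0^\delta |\phi(u)|^2\,du \Big\}^{1/2} \Big\{ \int_0^\delta e^{-x^2/u^2}\, \frac{du}{u^2} \Big\}^{1/2}
= \Big\{ \int_0^\delta |\phi(u)|^2\,du \Big\}^{1/2} \Big\{ \int_{1/\delta}^\infty e^{-x^2 v^2}\,dv \Big\}^{1/2}
\le \Big\{ \int_0^\delta |\phi(u)|^2\,du \Big\}^{1/2} \Big\{ \int_0^\infty e^{-x^2 v^2}\,dv \Big\}^{1/2}
= 2^{-1/2} \pi^{1/4} x^{-1/2} \Big\{ \int_0^\delta |\phi(u)|^2\,du \Big\}^{1/2}. \tag{21}$$

Also

$$\int_0^{\to\infty} \cos xt\,dt \int_0^\infty \phi(u)\, e^{-\frac12 t^2 u^2}\,du - \int_0^\infty \phi(u)\,du \int_0^\infty e^{-\frac12 t^2 u^2} \cos xt\,dt$$
$$= \int_0^{\to\infty} \cos xt\,dt \int_0^\delta \phi(u)\, e^{-\frac12 t^2 u^2}\,du - \int_0^\delta \phi(u)\,du \int_0^\infty e^{-\frac12 t^2 u^2} \cos xt\,dt$$
$$= \int_0^T \cos xt\,dt \int_0^\delta \phi(u)\, e^{-\frac12 t^2 u^2}\,du + \int_T^{\to\infty} \cos xt\,dt \int_0^\delta \phi(u)\, e^{-\frac12 t^2 u^2}\,du
- (\tfrac12 \pi)^{1/2} \int_0^\delta \phi(u)\, e^{-\frac12 x^2/u^2}\, \frac{du}{u}, \tag{22}$$

and

$$\Big| \int_0^T \cos xt\,dt \int_0^\delta \phi(u)\, e^{-\frac12 t^2 u^2}\,du \Big|
\le \int_0^T dt \int_0^\delta |\phi(u)|\,du
\le T \Big\{ \int_0^\delta |\phi(u)|^2\,du \Big\}^{1/2} \Big\{ \int_0^\delta du \Big\}^{1/2}
= T \delta^{1/2} \Big\{ \int_0^\delta |\phi(u)|^2\,du \Big\}^{1/2}. \tag{23}$$

Hence, by Lemma 2, (21), (22), and (23)

$$\Big| \int_0^{\to\infty} \cos xt\,dt \int_0^\infty \phi(u)\, e^{-\frac12 t^2 u^2}\,du - \int_0^\infty \phi(u)\,du \int_0^\infty e^{-\frac12 t^2 u^2} \cos xt\,dt \Big|
\le \Big\{ T \delta^{1/2} + \frac{2^{1/2} \pi^{1/4}}{x\,T^{1/2}} + \frac{\pi^{3/4}}{2\,x^{1/2}} \Big\} \Big\{ \int_0^\delta |\phi(u)|^2\,du \Big\}^{1/2}.$$

This can be made arbitrarily small by choosing $T$ first and then making $\delta$
sufficiently small, and Lemma 3 follows.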


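Proof of Theorem 5. Lemma 3 justifies the result (14). Further $\psi(x)$ belongs
to $L^2(0, \infty)$ since

$$\int_0^\infty |\psi(x)|^2\,dx = \int_0^\infty x^{-2}\, \Big| \phi\Big(\frac{1}{x}\Big) \Big|^2\,dx = \int_0^\infty |\phi(u)|^2\,du.$$

Hence $g(x)$, defined by (9), belongs to $A^2(e^{-\frac12 x^2})$. Then repeating the preceding
argument

$$\Big(\frac{2}{\pi}\Big)^{1/2} \int_0^{\to\infty} g(t) \cos xt\,dt
= \int_0^\infty \frac{1}{t}\,\psi\Big(\frac{1}{t}\Big)\, e^{-\frac12 x^2 t^2}\,dt
= \int_0^\infty \phi(t)\, e^{-\frac12 x^2 t^2}\,dt
= f(x)$$

since (12) implies that for almost all $x$

$$\frac{1}{x}\,\psi\Big(\frac{1}{x}\Big) = \phi(x).$$

This completes the proof of Theorem 5.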


PART II: FOURIER SERIES

3. The class of functions $S_1^2(0, 2\pi)$.


Definition 3. If $f(x)$ is a periodic function of period $2\pi$, is the integral of its
derivative, and $(\sin \tfrac12 x) f'(x)$ belongs to $L^2(0, 2\pi)$, then we say that $f(x)$
belongs to the class $S_1^2(0, 2\pi)$.


Definition 4. If $f(x)$ is a periodic function of period $2\pi$, and is such that there
exists a function $\phi(x)$ of $L^2(0, 2\pi)$ for which

$$f(x) = \operatorname{cosec} \tfrac12 x \int_0^x \phi(t)\,dt \tag{24}$$

and

$$\int_0^{2\pi} \phi(t)\,dt = 0 \tag{25}$$

then we say that $f(x)$ belongs to the class $S_1^2[0, 2\pi]$.

These definitions give two ways of characterizing the same class of
functions. Properties of this class of functions are given by the following theorem.

THEOREM 6. The classes $S_1^2(0, 2\pi)$ and $S_1^2[0, 2\pi]$ are identical, and all
functions $f(x)$ of either class belong to $L^2(0, 2\pi)$. Also $x^{1/2} f(x)$ and $x^{1/2} f(2\pi - x)$
both tend to zero as $x \to +0$.

This result is analogous to results given by (4).
To prove the result we use Lemmas 4, 5, and 6.


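LEMMA 4. If $f(x)$ belongs to $S_1^2[0, 2\pi]$ then $x^{1/2} f(x)$ and $x^{1/2} f(2\pi - x)$ both
tend to zero as $x \to +0$, and $f(x)$ belongs to $L^2(0, 2\pi)$.

Proof. As $x \to +0$, by (24) and Schwarz's inequality

$$|f(x)|^2 \le \operatorname{cosec}^2 \tfrac12 x \Big\{ \int_0^x |\phi(t)|^2\,dt \Big\} \Big\{ \int_0^x dt \Big\} = o(x^{-1}).$$

Hence $x^{1/2} f(x) \to 0$ as $x \to +0$. Further

$$f(2\pi - x) = \operatorname{cosec} \tfrac12 x \int_0^{2\pi - x} \phi(t)\,dt = -\operatorname{cosec} \tfrac12 x \int_{2\pi - x}^{2\pi} \phi(t)\,dt$$

by (25), and a similar argument shows that $x^{1/2} f(2\pi - x) \to 0$ as $x \to +0$. Now
let $0 < a < b < 2\pi$, put

$$\phi_1(x) = \int_0^x \phi(t)\,dt,$$

and suppose that $f(x)$ is real. Then

$$\int_a^b \{f(x)\}^2\,dx = \int_a^b \operatorname{cosec}^2 \tfrac12 x\, \{\phi_1(x)\}^2\,dx. \tag{26}$$

Integrating by parts (26) becomes

$$\big[ -2 \cot \tfrac12 x\, \{\phi_1(x)\}^2 \big]_a^b + 4 \int_a^b \cot \tfrac12 x\, \phi(x)\, \phi_1(x)\,dx.$$

As $a \to +0$ and $b \to 2\pi - 0$, this is

$$o(1) + 4 \int_a^b \cos \tfrac12 x\, \phi(x) f(x)\,dx
\le o(1) + 4 \Big\{ \int_a^b \cos^2 \tfrac12 x\, |\phi(x)|^2\,dx \Big\}^{1/2} \Big\{ \int_a^b |f(x)|^2\,dx \Big\}^{1/2}
\le o(1) + 4 \Big\{ \int_0^{2\pi} |\phi(x)|^2\,dx \Big\}^{1/2} \Big\{ \int_a^b |f(x)|^2\,dx \Big\}^{1/2}.$$

Dividing by

$$\Big\{ \int_a^b |f(x)|^2\,dx \Big\}^{1/2}$$

and taking the limit as $a \to +0$, $b \to 2\pi - 0$, we have

$$\Big\{ \int_0^{2\pi} |f(x)|^2\,dx \Big\}^{1/2} \le 4 \Big\{ \int_0^{2\pi} |\phi(x)|^2\,dx \Big\}^{1/2},$$

and hence $f(x)$ belongs to $L^2(0, 2\pi)$, as required. If $f(x)$ is complex the result
follows by splitting into real and imaginary parts.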


Lemma 5. If f(x) belongs to $S_1^2(0, 2\pi)$ then $x^{1/2} f(x)$ and $x^{1/2} f(2\pi - x)$ both tend to zero as $x \to +0$, and f(x) belongs to $L^2(0, 2\pi)$.


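Proof. By Definition 3 there exists a function $\psi(x) = \sin \tfrac12 x\, f'(x)$, belonging to $L^2(0, 2\pi)$, such that

(27) $f(x) - f(\pi) = \int_\pi^x \operatorname{cosec} \tfrac12 t\, \psi(t)\,dt.$

Suppose that $f(\pi) = 0$ and consider the behaviour of f(x) as $x \to +0$. Choose $0 < \delta < \pi$ so that

$\int_0^\delta |\psi(t)|^2\,dt < \epsilon.$

Then for $0 < x < \delta$

$|f(x)| \le \int_\delta^\pi \operatorname{cosec} \tfrac12 t\, |\psi(t)|\,dt + \int_x^\delta \operatorname{cosec} \tfrac12 t\, |\psi(t)|\,dt$

$\le \int_\delta^\pi \operatorname{cosec} \tfrac12 t\, |\psi(t)|\,dt + \left\{\int_x^\delta |\psi(t)|^2\,dt\right\}^{1/2} \left\{\int_x^\delta \operatorname{cosec}^2 \tfrac12 t\,dt\right\}^{1/2}$

$\le \operatorname{cosec} \tfrac12 \delta \int_0^\pi |\psi(t)|\,dt + \epsilon^{1/2} \left(2 \cot \tfrac12 x - 2 \cot \tfrac12 \delta\right)^{1/2}.$

Hence as $x \to +0$

$x^{1/2} f(x) = O(x^{1/2}) + O(\epsilon^{1/2}) \left(x \cot \tfrac12 x - x \cot \tfrac12 \delta\right)^{1/2} = O(x^{1/2}) + O(\epsilon^{1/2}) = o(1),$

since $x \cot \tfrac12 x \to 2$ as $x \to 0$.

A similar argument also shows that $x^{1/2} f(2\pi - x) \to 0$ as $x \to +0$.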
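Now suppose that $0 < a < \pi$, and that f(x) is real, and put

$f_1(x) = \int_0^x f(t)\,dt.$

Hence $f_1(x) = o(x^{1/2})$ as $x \to 0$. Also

$|f(x)| = \left|\int_x^\pi \operatorname{cosec} \tfrac12 t\, \psi(t)\,dt\right| \le \left\{\int_x^\pi |\psi(t)|^2\,dt\right\}^{1/2} \left\{\int_x^\pi \operatorname{cosec}^2 \tfrac12 t\,dt\right\}^{1/2} \le \left\{\int_0^\pi |\psi(t)|^2\,dt\right\}^{1/2} \left\{2 \cot \tfrac12 x\right\}^{1/2},$

whence $f_1(x)$ is bounded for the whole interval $(0, \pi)$.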
Now let 0 < a < x and consider 


fe) "dx = fe dx J cosec kt w(t) dt 


= [sce sce) | + | f(x) cosec 4 x W(x) dx 
As x > x — 0, f;(x) is bounded and f(x) + f(r) = 0, and as x + +0 


filx)f(x) = o(x!)o(x-) = o(1). 
Hence 


f {f(x)} "dx = o(1) +f fi(x) cosec § x W(x) dx 


< o(1) + iJ 


file) = ff at 
= f af cosec 4 u ¥(u) du 
0 t 
= f cosec 4 u W(u) du fia + J cosec 4 u W(u) du j dt 
0 0 z 0 


= f u cosec 4 u ¥(u) du — x f(x) 


2 } ( . 4 
fix) cosec px) axt iJ H(x)|? dx! , 





Now 













by (27). Hence 


(29) f(x) cosec 4 x = cosec bx f u cosec } u ¥(u) du — x cosec } x f(x) 
0 
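Now

$2 \le x \operatorname{cosec} \tfrac12 x \le \pi$

for $0 < x \le \pi$. Hence $x \operatorname{cosec} \tfrac12 x\, \psi(x)$ belongs to $L^2(0, \pi)$, and by Lemma 4

(30) $\operatorname{cosec} \tfrac12 x \int_0^x u \operatorname{cosec} \tfrac12 u\, \psi(u)\,du$

belongs to $L^2(0, \pi)$.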
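Substituting (29) in (28) and using Minkowski's inequality we have

$\int_a^\pi \{f(x)\}^2\,dx \le o(1) + \left[\left\{\int_a^\pi \left|\operatorname{cosec} \tfrac12 x \int_0^x u \operatorname{cosec} \tfrac12 u\, \psi(u)\,du\right|^2 dx\right\}^{1/2} + \left\{\int_a^\pi |x \operatorname{cosec} \tfrac12 x\, f(x)|^2\,dx\right\}^{1/2}\right] \left\{\int_a^\pi |\psi(x)|^2\,dx\right\}^{1/2}$

$\le o(1) + \left[\left\{\int_0^\pi \left|\operatorname{cosec} \tfrac12 x \int_0^x u \operatorname{cosec} \tfrac12 u\, \psi(u)\,du\right|^2 dx\right\}^{1/2} + \pi \left\{\int_a^\pi |f(x)|^2\,dx\right\}^{1/2}\right] \left\{\int_0^\pi |\psi(x)|^2\,dx\right\}^{1/2},$

since we know that (30) and $\psi(x)$ both belong to $L^2(0, \pi)$. That is,

(31) $\int_a^\pi |f(x)|^2\,dx \le A + B \left\{\int_a^\pi |f(x)|^2\,dx\right\}^{1/2},$

where A and B are constants independent of a. Now unless f(x) vanishes almost everywhere in $(0, \pi)$ we can find an $a_1$ and a k such that $0 < a_1 < \pi$ and

$\left\{\int_{a_1}^\pi |f(x)|^2\,dx\right\}^{1/2} \ge k > 0.$

From (31)

$\left\{\int_a^\pi |f(x)|^2\,dx\right\}^{1/2} \le A \left\{\int_a^\pi |f(x)|^2\,dx\right\}^{-1/2} + B \le \frac{A}{k} + B$

for $a < a_1$. Hence

$\left\{\int_0^\pi |f(x)|^2\,dx\right\}^{1/2} \le \frac{A}{k} + B$

and f(x) belongs to $L^2(0, \pi)$. Combining the above with a similar argument for the interval $(\pi, 2\pi)$ we find that f(x) belongs to $L^2(0, 2\pi)$.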





If $f(\pi)$ is not zero the above argument shows that $f(x) - f(\pi)$ belongs to $L^2(0, 2\pi)$, so f(x) belongs to $L^2(0, 2\pi)$.

Lastly, if f(x) is complex the result of Lemma 5 follows by splitting into 
real and imaginary parts. 

Lemma 6. The classes $S_1^2(0, 2\pi)$ and $S_1^2[0, 2\pi]$ are identical.


Proof. With $\phi(x)$ and $\psi(x)$ as in Lemmas 4 and 5 we have

(32) $f(x) = \operatorname{cosec} \tfrac12 x \int_0^x \phi(t)\,dt = f(\pi) - \int_x^\pi \operatorname{cosec} \tfrac12 t\, \psi(t)\,dt.$

By differentiation

$\phi(x) = \sin \tfrac12 x\, f'(x) + \tfrac12 \cos \tfrac12 x\, f(x)$

and

$\psi(x) = \sin \tfrac12 x\, f'(x)$

almost everywhere in $(0, 2\pi)$. Hence

(33) $\phi(x) = \psi(x) + \tfrac12 \cos \tfrac12 x\, f(x)$

almost everywhere in $(0, 2\pi)$.

Now if f(x) belongs to $S_1^2[0, 2\pi]$ this means that $\phi(x)$ belongs to $L^2(0, 2\pi)$, and, by Lemma 4, so does f(x). Hence from (33) $\psi(x)$ also belongs to $L^2(0, 2\pi)$; that is, f(x) belongs to $S_1^2(0, 2\pi)$.

Conversely if f(x) belongs to $S_1^2(0, 2\pi)$ then by Lemma 5 it also belongs to $L^2(0, 2\pi)$ and $\psi(x)$ belongs to $L^2(0, 2\pi)$. Hence from (33) $\phi(x)$ also belongs to $L^2(0, 2\pi)$. Also by (32)

$\int_0^x \phi(t)\,dt = \sin \tfrac12 x\, f(x).$

By Lemma 5 $x^{1/2} f(2\pi - x) \to 0$ as $x \to +0$. Hence

$\int_0^{2\pi} \phi(t)\,dt = \lim_{x \to +0} \{\sin \tfrac12 (2\pi - x)\, f(2\pi - x)\} = 0.$

That is, f(x) belongs to $S_1^2[0, 2\pi]$, and this completes the proof of Lemma 6.
Combining Lemmas 4, 5, and 6 we have Theorem 6.
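
As an illustrative numerical sketch (not part of the original argument), one may take the hypothetical test function φ(t) = cos t, whose integral over (0, 2π) vanishes. Reading (24) as f(x) = cosec ½x ∫_0^x φ(t) dt then gives f(x) = 2 cos ½x, and the relation φ = ψ + ½ cos ½x f, with ψ = sin ½x f'(x), together with the bound ‖f‖ ≤ 4‖φ‖ from the proof of Lemma 4, can be checked directly. The function names and the midpoint quadrature are assumptions made only for this example.

```python
import math

def midpoint(g, a, b, steps=20000):
    """Composite midpoint rule for the integral of g over (a, b)."""
    h = (b - a) / steps
    return h * sum(g(a + (j + 0.5) * h) for j in range(steps))

def phi(t):
    """Hypothetical test function; its integral over (0, 2*pi) is zero."""
    return math.cos(t)

def f(x):
    """f(x) = cosec(x/2) * integral_0^x phi(t) dt, evaluated numerically."""
    return midpoint(phi, 0.0, x, 2000) / math.sin(0.5 * x)

def psi(x):
    """psi(x) = sin(x/2) f'(x), using the closed form f(x) = 2 cos(x/2)."""
    return math.sin(0.5 * x) * (-math.sin(0.5 * x))

two_pi = 2.0 * math.pi
# L^2 norms over (0, 2*pi); the closed form of f avoids nested quadrature.
norm_f = math.sqrt(midpoint(lambda x: (2.0 * math.cos(0.5 * x)) ** 2, 0.0, two_pi))
norm_phi = math.sqrt(midpoint(lambda x: phi(x) ** 2, 0.0, two_pi))
```

For this particular φ the ratio norm_f / norm_phi is about 2, comfortably inside the constant 4 obtained in the proof.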

We also require the following result to connect $S_1^2(0, 2\pi)$ with Fourier series in § 5.


THEOREM 7. The class $S_1^2(0, 2\pi)$ is identical with the class of functions f(x) of period $2\pi$ which can be expressed in the form

(34) $f(x) = \frac{1}{1 - e^{-ix}} \int_0^x \chi(t)\,dt,$

where $\chi(x)$ belongs to $L^2(0, 2\pi)$ and

(35) $\int_0^{2\pi} \chi(t)\,dt = 0.$

Proof. By (24), (25), (34), and (35) we require that

(36) $\frac{1}{1 - e^{-ix}} \int_0^x \chi(t)\,dt = \operatorname{cosec} \tfrac12 x \int_0^x \phi(t)\,dt,$

where

$\int_0^{2\pi} \phi(t)\,dt = \int_0^{2\pi} \chi(t)\,dt = 0$

and $\phi(x)$ and $\chi(x)$ belong to $L^2(0, 2\pi)$. Now (36) gives

(37) $\int_0^x \chi(t)\,dt = 2i e^{-\frac12 ix} \int_0^x \phi(t)\,dt$

and hence

(38) $\chi(x) = 2i e^{-\frac12 ix} \phi(x) + e^{-\frac12 ix} \int_0^x \phi(t)\,dt$

almost everywhere in $(0, 2\pi)$. If $\phi(x)$ belongs to $L^2(0, 2\pi)$, so does

$\int_0^x \phi(t)\,dt,$

and hence from (38), $\chi(x)$ belongs to $L^2(0, 2\pi)$. A similar argument shows that $\phi(x)$ belongs to $L^2(0, 2\pi)$ if $\chi(x)$ does.
Finally if we put $x = 2\pi$ in (37) we have

$\int_0^{2\pi} \chi(t)\,dt = -2i \int_0^{2\pi} \phi(t)\,dt.$

Hence the vanishing of either of these integrals implies the vanishing of the other.
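
The correspondence between φ and χ can be illustrated numerically. The sketch below assumes that (34) is read as f(x) = (1 − e^{−ix})^{−1} ∫_0^x χ(t) dt and that (38) is read as χ(x) = 2i e^{−ix/2} φ(x) + e^{−ix/2} ∫_0^x φ(t) dt; both readings are reconstructions, and the test function φ(t) = cos t and all function names are hypothetical.

```python
import cmath
import math

def chi(x):
    """chi built from phi(t) = cos t, whose integral from 0 to x is sin x."""
    return cmath.exp(-0.5j * x) * (2j * math.cos(x) + math.sin(x))

def integral_chi(x, steps=20000):
    """Midpoint approximation to the integral of chi over (0, x)."""
    h = x / steps
    return h * sum(chi((j + 0.5) * h) for j in range(steps))

def f_from_chi(x):
    """f(x) recovered from chi through the assumed form of (34)."""
    return integral_chi(x) / (1.0 - cmath.exp(-1j * x))
```

With this φ the function recovered from χ should agree with 2 cos ½x, the function obtained from φ through (24), and the integral of χ over a full period should vanish.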


4. The class of sequences $\Sigma_1^2(-\infty, \infty)$.


THEOREM 8. If $\{c_n\}$, $(n = 0, 1, 2, \ldots)$, is a sequence of complex numbers such that the series

$\sum_{n=1}^{\infty} n^2 |c_n - c_{n+1}|^2$

is convergent, then

(i) $c_n$ tends to a finite limit l as $n \to \infty$, and

(39) $c_n - l = o(n^{-1/2}),$

(ii) the series

$\sum_{n=1}^{\infty} |c_n - l|^2$

converges.


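Proof of (i). If $m > n \ge 1$ then, by Schwarz's inequality,

(40) $|c_n - c_m| = \left|\sum_{r=n}^{m-1} (c_r - c_{r+1})\right| \le \left\{\sum_{r=n}^{m-1} r^2 |c_r - c_{r+1}|^2\right\}^{1/2} \left\{\sum_{r=n}^{m-1} r^{-2}\right\}^{1/2} = o(n^{-1/2})$

as $n \to \infty$. Hence, by the principle of convergence, $c_n$ tends to a finite limit l as $n \to \infty$, and making $m \to \infty$ in (40) we have (39).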
LEMMA 7. If $\{a_n\}$, $(n = 1, 2, 3, \ldots)$, is any sequence of complex numbers and N a positive integer then

$6 \sum_{n=1}^{N-1} n^2 |a_n - a_{n+1}|^2 + 2N |a_N|^2 \ge \sum_{n=1}^{N} |a_n|^2.$

Proof of Lemma 7. We have

$6n^2 |a_n - a_{n+1}|^2 + (2n + 1) |a_{n+1}|^2$

$= (2n^2 - 2n) |a_n - a_{n+1}|^2 + \{2n |a_n - a_{n+1}| - |a_{n+1}|\}^2 + 2n \{|a_n - a_{n+1}| + |a_{n+1}|\}^2$

$\ge 2n \{|a_n - a_{n+1}| + |a_{n+1}|\}^2$

$\ge 2n |a_n|^2,$

since $2n^2 - 2n \ge 0$ for all integers n and

$|a_n - a_{n+1}| + |a_{n+1}| \ge |a_n|.$

Hence

$6n^2 |a_n - a_{n+1}|^2 + 2(n + 1) |a_{n+1}|^2 - 2n |a_n|^2 \ge |a_{n+1}|^2,$

and the lemma follows on summing over $n = 0, 1, 2, \ldots, N - 1$.
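
The inequality of Lemma 7, read as 6 Σ_{n=1}^{N−1} n²|a_n − a_{n+1}|² + 2N|a_N|² ≥ Σ_{n=1}^{N} |a_n|², lends itself to a quick numerical sanity check. The following is only a sketch on randomly generated sequences; the helper names are hypothetical and the check is of course no substitute for the proof.

```python
import random

def lemma7_sides(a):
    """Return (lhs, rhs) of the Lemma 7 inequality.

    The list entry a[n - 1] plays the role of a_n, for n = 1, ..., N.
    """
    n_terms = len(a)
    lhs = 6 * sum(n * n * abs(a[n - 1] - a[n]) ** 2 for n in range(1, n_terms))
    lhs += 2 * n_terms * abs(a[-1]) ** 2
    rhs = sum(abs(z) ** 2 for z in a)
    return lhs, rhs

def random_sequence(length, rng):
    """A random complex sequence with entries in the unit square."""
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(length)]
```

For a constant sequence a_n = 1 the two sides are 2N and N, which shows that the boundary term 2N|a_N|² cannot simply be dropped.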





Proof of (ii). If we put $a_n = c_n - l$ then by (39) $N |a_N|^2$ tends to zero as $N \to \infty$. Hence

$6 \sum_{n=1}^{\infty} n^2 |c_n - c_{n+1}|^2 \ge \sum_{n=1}^{\infty} |c_n - l|^2$

and the latter series converges.


Definition 5. If $\{c_n\}$, $(n = 0, \pm 1, \pm 2, \ldots)$, is a sequence of complex numbers such that

$\sum_{n=-\infty}^{\infty} n^2 |c_n - c_{n+1}|^2$

converges, and if the limits to which $c_n$ tends as $n \to \pm\infty$ are both zero, then we say that the sequence $\{c_n\}$ belongs to the class $\Sigma_1^2(-\infty, \infty)$.














5. The convergence of Fourier series for $S_1^2(0, 2\pi)$.


THEOREM 9. If $\{c_n\}$ belongs to the class $\Sigma_1^2(-\infty, \infty)$ then the series

(41) $\sum_{n=-\infty}^{\infty} c_n e^{inx}$

converges for all x not congruent to zero modulo $2\pi$, and its sum defines a function f(x), belonging to $S_1^2(0, 2\pi)$, of which (41) is the Fourier series.


Proof. Consider the series 


(42) zed”, 
n=1 
and put 
(Cr — Cat1) = Xn 
Then 
Gg = (Cn = Cn+1) + (Cu1 — Cn+2) + eee 
of & 
rom «(COT 
Hence 
N N @ 
(43) > Cn e™ - } e™ } in -) 
n=1 n=1 mn 
N x r a N 
r inz r inz 
= —_— é — 
> T n=l +2 T n=l : 
N ( a(r+1)z 7 ( 4(N+1)2 

2), atid, x >] 

at es ord f+ ons 7-1 J° 
Now 








r=N+11 7 =N+1 
ia o(N~) 
since the series 
(44) De |xel" 


converges by hypothesis. Hence by (43), if x # 0 mod 2z 
N N 
nm. ] r r+l1)z — 
dD ae™ = az DD * fe** — 1) + on), 
n=1 é — 1 ral 


and the series on the right is absolutely convergent. Hence the series (42) 
converges, and 


(45) > Cn e™ _ - Xr i frrne = 1}. 


iz 
n=l é —_ l oan 













Since the series (44) converges the series

$\sum_{r=1}^{\infty} \chi_r e^{irx}$

converges in mean square to a function $\chi(x)$ belonging to $L^2(0, 2\pi)$, and is the Fourier series of this function. Hence, by the Fourier series integration theorem (6, p. 419),

$\int_0^x \chi(t)\,dt = -i \sum_{r=1}^{\infty} \frac{\chi_r}{r} (e^{irx} - 1),$

and in particular 


Hence by (45) 
(46) > ae™ = — af x(t) dt + x 
-  ¢ 0 rai 7 


By Theorem 7 it follows that (46) is a function of the class $S_1^2(0, 2\pi)$. A similar argument for negative n shows that the whole series (41) converges for $x \not\equiv 0 \bmod 2\pi$, and that its sum is a function of the class $S_1^2(0, 2\pi)$.

Since $S_1^2(0, 2\pi)$ is a subclass of $L^2(0, 2\pi)$ the series (41) must be the Fourier series of its sum.


THEOREM 10. If f(x) belongs to the class $S_1^2(0, 2\pi)$ then it has a Fourier series

$\sum_{n=-\infty}^{\infty} c_n e^{inx}$

which converges to f(x) for all x not congruent to zero modulo $2\pi$, and the sequence $\{c_n\}$ belongs to the class $\Sigma_1^2(-\infty, \infty)$.


Proof. By Theorem 6 the function f(x) belongs to $L^2(0, 2\pi)$. Hence it has a Fourier series

(47) $f(x) \sim \sum_{n=-\infty}^{\infty} c_n e^{inx}$

for which $c_n$ tends to zero as $n \to \pm\infty$.
By Theorem 7 there exists a function $\chi(x)$ of $L^2(0, 2\pi)$ such that

(48) $f(x) = \frac{1}{1 - e^{-ix}} \int_0^x \chi(t)\,dt$

and

$\int_0^{2\pi} \chi(t)\,dt = 0.$

Hence if

$\chi(x) \sim \sum_{r=-\infty}^{\infty} \chi_r e^{irx}$

then $\chi_0 = 0$ and

$\sum_{r=-\infty}^{\infty} |\chi_r|^2$

converges. By the Fourier series integration theorem and (48)

$f(x) = \frac{-i}{1 - e^{-ix}} \sum_{r \ne 0} \frac{\chi_r}{r} (e^{irx} - 1).$

Hence

(49) $f(x)(1 - e^{-ix}) = -i \sum_{r \ne 0} \frac{\chi_r}{r} e^{irx} + i \sum_{r \ne 0} \frac{\chi_r}{r},$

since both of these series converge absolutely.
Now

$f(x)(1 - e^{-ix}) \sim (1 - e^{-ix}) \sum_{n=-\infty}^{\infty} c_n e^{inx} = \sum_{n=-\infty}^{\infty} (c_n - c_{n+1}) e^{inx}.$

By (49) and the uniqueness theorem for Fourier series of functions of $L^2(0, 2\pi)$ it follows that for $n \ne 0$

$-i \frac{\chi_n}{n} = c_n - c_{n+1}.$

Hence the series

$\sum_{n=-\infty}^{\infty} n^2 |c_n - c_{n+1}|^2 = \sum_{n=-\infty}^{\infty} |\chi_n|^2$

converges, and therefore the sequence $\{c_n\}$ belongs to the class $\Sigma_1^2(-\infty, \infty)$, as required.

By Theorem 9 the series (47) converges for $x \not\equiv 0 \bmod 2\pi$ to a function of $S_1^2(0, 2\pi)$ which must therefore be equal to f(x) almost everywhere. From (34) functions of $S_1^2(0, 2\pi)$ are continuous for $x \not\equiv 0 \bmod 2\pi$, so the sum of the series (47) must be equal to f(x) for all such x.


REFERENCES 


1. A. P. Guinand, Summation formulae and self-reciprocal functions, I, II, Quart. J. Math. (1), 9 (1938), 53-67, and (1), 10 (1939), 104-118.
2. A. P. Guinand, On Poisson's summation formula, Ann. Math. (2), 42 (1941), 591-603.
3. A. P. Guinand, General transformations and the Parseval theorem, Quart. J. Math. (1), 12 (1941), 51-56.
4. G. H. Hardy, J. E. Littlewood, and G. Polya, Inequalities (Cambridge, 1934), 239-246.
5. J. B. Miller, A symmetrical convergence theory for general transforms, Proc. London Math. Soc. (3), 8 (1958), 224-241.
6. E. C. Titchmarsh, The Theory of Functions (Oxford, 1939).
7. E. C. Titchmarsh, An introduction to the theory of Fourier integrals (Oxford, 1948).





University of Saskatchewan 













THE ANALYTIC CONTINUATION OF THE RIEMANN- 
LIOUVILLE INTEGRAL IN THE HYPERBOLIC CASE 


MARCEL RIESZ 


Introduction. In 1949 I published in the Acta Mathematica (vol. 81) a 
rather long paper: “‘L’intégrale de Riemann-Liouville et le probléme de 
Cauchy.”’ This work will be quoted in the sequel as Acta paper. Only minor 
local references to this paper will be made here, and knowledge of it is not 
required for the reading of the present article. The notations used here are 
slightly different from those used in my former paper. 

In the Acta paper I introduce multiple integrals $I^\alpha$ and $I_*^\alpha$ of the Riemann-Liouville type depending on a parameter $\alpha$ and converging for sufficiently large values of $\alpha$. I give the solution of the Cauchy problem for the wave equation in a unique formula, the same for space-time of odd or even dimensions, implying an analytic continuation with respect to the parameter $\alpha$. When this analytic continuation is carried out, it leads to final formulae of quite different types for odd or even dimensions, the one relative to even dimensions obeying the Huygens principle.

The main difficulty concerning the analytic continuation was to prove that $I^0$ is the identity operator. My way of doing this was neither simple nor elegant. The principal aim of the present paper is to give a more satisfactory proof.

I hope that the present approach will be useful in other connections as well. Indeed, this method of analytic continuation has found unexpected applications in other fields. Here I only make reference to results of Gelfand and Grajew.¹


1. Preliminaries. If the co-ordinates of a point x in m-dimensional space-time or Lorentz-space are denoted by $x^0, x^1, \ldots, x^{m-1}$, the metric form will be
(1.1) $(x, x) = (x^0)^2 - (x^1)^2 - \ldots - (x^{m-1})^2 = l_{ik} x^i x^k,$
where the ordinary summation convention is used. The square of the distance of two points x and y is given by
(1.2) $r^2 = r_{xy}^2 = (x - y, x - y) = l_{ik} (x^i - y^i)(x^k - y^k).$


The scalar product (a, b) of two vectors a and b, with the respective components $a^i$ and $b^i$, is defined by


Received October 19, 1959. 

¹ See Appendix III of the book by I. M. Gelfand and M. A. Neumark, Unitäre Darstellungen der klassischen Gruppen (Berlin: Akademie-Verlag, 1957), also A.M.S. translations, Series 2, vol. 9, pp. 123-154.


(1.3) $(a, b) = l_{ik} a^i b^k.$


Two vectors whose scalar product vanishes are said to be orthogonal to each 
other. In what follows, orthogonality and normality are always meant in this 
sense. 

According as the scalar square (a, a) of a vector a is (1) positive, (2) zero, (3) negative, the vector is said to be (1) time-like, (2) light-like or a null vector, (3) space-like. A time-like or a light-like vector a is called positive or negative according as its time component $a^0$ is positive or negative.

Time-like unit vectors u and space-like unit vectors v are defined by the relations (u, u) = 1 and (v, v) = −1 respectively.

The light cone or characteristic cone with vertex a is given by the equation (x − a, x − a) = 0. The positive and negative half-cones correspond to $x^0 - a^0 > 0$ or $< 0$ respectively. These half-cones will be called positive and negative light cones in the sequel.

Consider now a p-dimensional (curved) variety S whose points y are referred to p parameters $\lambda^1, \lambda^2, \ldots, \lambda^p$. The p-dimensional volume element dS of S, or alternately surface element if 1 < p < m, can be defined in the following way (cf. Acta paper pp. 44-45). Let $ds^2 = (dy, dy) = \sum \gamma_{ik}\, d\lambda^i d\lambda^k$ be the square of the arc element in S. Form the determinant $\gamma = |\gamma_{ik}|$. Then


(1.4) $dS = \sqrt{|\gamma|}\; d\lambda^1 d\lambda^2 \ldots d\lambda^p.$


An (m−1)-dimensional surface is said to be space-like if its normal is time-like. Let S be a space-like surface. Suppose that the negative light cone $C^x$ with vertex x and the surface S enclose a bounded domain $D_S^x$. We shall consider functions defined in domains including $D_S^x$ and make the blanket hypothesis that the functions and all their derivatives with respect to the Cartesian co-ordinates which explicitly or implicitly enter into our computations exist and are continuous. We express this by saying that the functions are well behaved. The same phrase will be used in an appropriate sense in connection with the surface S and functions defined on S.

We form the volume potential


(1.5) $I^\alpha f(x) = \frac{1}{H_m(\alpha)} \int_{D_S^x} f(y)\, r_{xy}^{\alpha - m}\, dV,$
where $dV = dy^0 dy^1 \ldots dy^{m-1}$ is the volume element of m-space and
(1.6) $H_m(\alpha) = \pi^{\frac12 (m-2)}\, 2^{\alpha - 1}\, \Gamma(\tfrac12 \alpha)\, \Gamma(\tfrac12 (\alpha + 2 - m)).$


The integral in (1.5) converges for $\alpha > m - 2$ (cf. Acta paper, p. 31), or more generally for $\operatorname{Re} \alpha > m - 2$, if we admit complex values of $\alpha$. Similarly, our subsequent assertions about convergence of integrals or analytic continuation of $I^\alpha$ or $I_*^\alpha$ (see below) remain valid for complex $\alpha$, if we replace all inequalities of the type $\alpha > \alpha_0$ by $\operatorname{Re} \alpha > \operatorname{Re} \alpha_0$.

Besides the volume potentials we also consider potentials of a simple layer 
and of a double layer. 
















Let $S^x$ be that part of the surface S which is interior to the cone $C^x$. Denote further by n the positive unit normal to S and let g and h be two functions defined on S. We write


(1.7) $I_*^\alpha f, g, h(x) = \frac{1}{H_m(\alpha)} \int_{D_S^x} f(y)\, r_{xy}^{\alpha - m}\, dV + \frac{1}{H_m(\alpha)} \int_{S^x} \left\{ g(y)\, r_{xy}^{\alpha - m} - h(y)\, \frac{d\, r_{xy}^{\alpha - m}}{dn} \right\} dS,$
where dS is the surface element of S (cf. formula (1.4)).

The simple layer converges for $\alpha > m - 2$, while the double layer, whose kernel has a stronger singularity, converges only for $\alpha > m$ (cf. Acta paper, pp. 48-49, and §4 of the present paper).

We will show that by virtue of our hypotheses about the behaviour of the surface S and the functions f, g, h the integral $I_*^\alpha$ can be continued analytically down to an arbitrary value $\alpha_0 \le 0$. Moreover, if $\alpha_0 < 0$, then


(1.8) $I_*^0 f, g, h(x) = I^0 f(x) = f(x).$


For a specification of the derivatives needed for different purposes cf. Acta 
paper, pp. 59-60, 64, 223. 

Some simple facts concerning the analytic continuation of the ordinary Riemann-Liouville integral in one dimension will be needed in the sequel (cf. Acta paper, pp. 14-16).

Set


(1.9) $J^\alpha f(0) = \frac{1}{\Gamma(\alpha)} \int_0^b f(t)\, t^{\alpha - 1}\, dt.$


If f(t) is continuous in the closed interval [0, b], this integral is convergent for $\alpha > 0$. If for $k \le n$ the derivatives $f^{(k)}(t)$ exist and are continuous in [0, b], then $J^\alpha f(0)$ has a holomorphic continuation to all $\alpha > -n$. Moreover, if p is an integer $0 \le p < n$, then


(1.10) $J^{-p} f(0) = (-1)^p f^{(p)}(0).$


(As a matter of fact, only the case p = 0 will be used explicitly in the sequel.)
To prove this, set

$P(t) = \sum_{k=0}^{n-1} \frac{f^{(k)}(0)}{k!}\, t^k.$

Then we have for $\alpha > 0$, to begin with, and subsequently by analytic continuation, for all $\alpha > -n$


(1.11) $J^\alpha f(0) = \frac{1}{\Gamma(\alpha)} \int_0^b [f(t) - P(t)]\, t^{\alpha - 1}\, dt + \frac{1}{\Gamma(\alpha)} \sum_{k=0}^{n-1} \frac{f^{(k)}(0)}{k!}\, \frac{b^{\alpha + k}}{\alpha + k}.$


Indeed, the last integral is convergent and the whole expression (1.11) is holomorphic for $\alpha > -n$. $J^\alpha f(0)$ reduces to $(-1)^p f^{(p)}(0)$ at $\alpha = -p$, since $\Gamma(\alpha)$ has a simple pole with residue $(-1)^p / p!$ at this point.
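
The continuation can be tried out numerically. The sketch below evaluates an expression of the form (1.11), read here as Γ(α)^{-1} { ∫_0^b [f(t) − P(t)] t^{α−1} dt + Σ_{k<n} f^{(k)}(0) b^{α+k} / (k!(α+k)) } with P the Taylor polynomial of degree n − 1; that reading is a reconstruction. The function name, the midpoint quadrature, and the choice f(t) = e^t are assumptions made only for illustration. Since 1/Γ(α) vanishes exactly at the points α = −p, the limiting values are probed at nearby non-integer α.

```python
import math

def rl_continued(f, taylor_coeffs, alpha, b=1.0, steps=20000):
    """Evaluate the continued Riemann-Liouville integral at a non-integer alpha.

    taylor_coeffs[k] must equal f^(k)(0) / k!, for k = 0, ..., n - 1,
    and alpha must satisfy alpha > -n.
    """
    def taylor_poly(t):
        return sum(c * t ** k for k, c in enumerate(taylor_coeffs))

    # Midpoint rule for the regularised integral; it never samples t = 0.
    h = b / steps
    integral = 0.0
    for j in range(steps):
        t = (j + 0.5) * h
        integral += (f(t) - taylor_poly(t)) * t ** (alpha - 1.0)
    integral *= h

    # Explicitly integrated Taylor terms.
    boundary = sum(c * b ** (alpha + k) / (alpha + k)
                   for k, c in enumerate(taylor_coeffs))
    return (integral + boundary) / math.gamma(alpha)
```

For f(t) = e^t with taylor_coeffs = [1.0, 1.0, 0.5], the value near α = 0 is close to f(0) = 1 and the value near α = −1 is close to −f'(0) = −1, in agreement with the p = 0 and p = 1 cases of (1.10).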










The following extension will also be needed, and the corresponding result 
will be quoted as the extended one-dimensional case. Its verification is left to 
the reader. 

Let f(t) also depend on $\alpha$. If f(t) and its derivatives with respect to t up to the order n are continuous in the closed interval [0, b] and moreover are holomorphic in $\alpha$ for $\alpha > -n$, then our above statement and its proof remain valid, except for some slight changes in the notations.


2. A co-ordinate system. We place the origin O at the point x and will eventually refer the domain $D_S^O$ to co-ordinates which are to be introduced here. We denote a fixed negative time-like unit vector by a and a variable space-like unit vector orthogonal to a by v. In a suitable Lorentz frame a and v can be written $a = (-1, 0, \ldots, 0)$ and $v = (0, v^1, \ldots, v^{m-1})$ with $\sum (v^i)^2 = 1$. If the vector v issues from the origin, its endpoint describes the unit sphere $S_{m-2}$ lying in the (m−1)-plane orthogonal to a. We write out explicitly that
(2.1) $(a, a) = 1,\quad (v, v) = -1,\quad (a, v) = 0.$

An arbitrary position vector y can be written


(2.2) $y = ta + \rho v,\quad \rho \ge 0.$
We always suppose that also $t \ge 0$. This inequality is obviously satisfied in the domain $D_S^O$.

The relation (2.2) can also be written

$y = \tfrac12 (t + \rho)(a + v) + \tfrac12 (t - \rho)(a - v).$

Furthermore, if we set
(2.3) $b = \tfrac12 (a + v),\quad c = \tfrac12 (a - v),$
then


$y = (t + \rho) b + (t - \rho) c = (t + \rho) \left( b + \frac{t - \rho}{t + \rho}\, c \right).$


Setting now


(2.4) $\tau = \frac{t - \rho}{t + \rho},\quad \sigma = t + \rho,$
we obtain

(2.5) $y = \sigma (b + \tau c).$

The inverted formulae (2.4) are

(2.6) $\rho = \tfrac12 \sigma (1 - \tau),\quad t = \tfrac12 \sigma (1 + \tau).$
It follows from (2.1) and (2.3) that

(2.7) $(b, b) = 0,\quad (c, c) = 0,\quad (b, c) = \tfrac12.$


Hence b and c are (negative) null vectors.
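
The relations of this section are easy to verify numerically. The sketch below assumes the Lorentz frame a = (−1, 0, …, 0), v = (0, v¹, …, v^{m−1}) with v a Euclidean unit vector, together with the null vectors b = ½(a + v), c = ½(a − v) and the variables σ = t + ρ, τ = (t − ρ)/(t + ρ). The function names are hypothetical and serve only this illustration.

```python
def lorentz(x, y):
    """Lorentz scalar product (x, y) = x^0 y^0 - x^1 y^1 - ... ."""
    return x[0] * y[0] - sum(p * q for p, q in zip(x[1:], y[1:]))

def null_frame(v_spatial):
    """Return a, v, b, c for a Euclidean unit vector v_spatial."""
    a = [-1.0] + [0.0] * len(v_spatial)
    v = [0.0] + list(v_spatial)
    b = [0.5 * (p + q) for p, q in zip(a, v)]
    c = [0.5 * (p - q) for p, q in zip(a, v)]
    return a, v, b, c

def to_sigma_tau(t, rho):
    """The new variables sigma = t + rho and tau = (t - rho) / (t + rho)."""
    return t + rho, (t - rho) / (t + rho)

def point_from_t_rho(t, rho, a, v):
    """y = t a + rho v."""
    return [t * p + rho * q for p, q in zip(a, v)]

def point_from_sigma_tau(sigma, tau, b, c):
    """y = sigma (b + tau c)."""
    return [sigma * (p + tau * q) for p, q in zip(b, c)]
```

For instance, with m = 4, v = (0.6, 0.8, 0), t = 2 and ρ = 0.5 one obtains σ = 2.5 and τ = 0.6, and both parametrizations give the same point y with (y, y) = t² − ρ² = σ²τ = 3.75.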







The variables $\tau$ and $\sigma$ and the angular variable v which varies on the sphere $S_{m-2}$ and determines the vectors b and c will be our new co-ordinates. Here are the principal merits of $\tau$ and $\sigma$. The square of the Lorentz distance of a point from the vertex can be expressed and "separated" in $\tau$ and $\sigma$. The vertex of the cone $C^O$ is given by the single equation $\sigma = 0$, while the cone apart from the vertex is given by the equation $\tau = 0$. The derivatives $\partial^p f / \partial \tau^p$ of an arbitrary function f(y) vanish at the vertex since they contain the factor $\sigma^p$.

We prove these assertions and complete them in certain respects. The square $r^2$ of the Lorentz distance is according to (2.2) and (2.1)


$r^2 = (y, y) = t^2 - \rho^2 = (t + \rho)^2\, \frac{t - \rho}{t + \rho}.$


Hence
(2.8) $r^2 = (y, y) = \sigma^2 \tau.$


The same relation also follows from (2.5) and (2.7), since (2.7) gives $(b + \tau c, b + \tau c) = \tau$. The equation of the cone $C^O$ is (y, y) = 0. At the vertex $\sigma = 0$, while $\tau$ is indeterminate. On the cone, except at the vertex, $\tau = 0$, $\sigma > 0$. It follows from (2.4) that $0 < \tau \le 1$ inside the cone and that, in particular, $\tau = 1$ on the axis $y = \sigma a$ (or $\rho = 0$) and only there.

We always have $\sigma \ge 0$, according to (2.4) and the inequalities $t \ge 0$, $\rho \ge 0$. The equation $\sigma = \text{const.} = \gamma > 0$, which, in view of (2.4), is equivalent to $t + \rho = \gamma$, is the equation of a positive light cone $C_{\gamma a}$ with the vertex $\gamma a$. It is clear that the inequalities $0 \le \tau \le 1$ and $0 \le \sigma \le \gamma$ characterize the interior and the boundary of a double cone $D_{\gamma a}^O$ limited by the negative light cone $C^O$ and the positive light cone $C_{\gamma a}$.

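From now on we make ample use of our hypothesis that the function f(y) is well behaved (cf. p. 38). We have

(2.9) $\frac{\partial y}{\partial \tau} = \frac{\partial}{\partial \tau} [\sigma (b + \tau c)] = \sigma c,\quad \frac{\partial^p y}{\partial \tau^p} = 0,\quad p = 2, 3, \ldots.$


From this it follows for any function f(y)

$\frac{\partial f}{\partial \tau} = \sigma (c^k \partial_k) f,\quad \text{where } \partial_k = \frac{\partial}{\partial y^k},$

and, more generally, for any positive integer p,

(2.10) $\frac{\partial^p f}{\partial \tau^p} = \sigma^p (c^k \partial_k)^p f.$


This proves our assertion about the behaviour of the derivatives with respect to $\tau$ at the vertex.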


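3. The volume potential. If the surface element of the sphere $S_{m-2}$ is denoted by $dS_{m-2}$, the volume element dV of the m-space can be written $dV = \rho^{m-2}\, d\rho\, dt\, dS_{m-2}$. From (2.6) we have that $\rho = \tfrac12 \sigma (1 - \tau)$ and that the Jacobian


$\frac{\partial(\rho, t)}{\partial(\sigma, \tau)} = \tfrac12 \sigma.$


Making use of (2.5) and (2.8) we obtain after some simplifications


(3.1) $\frac{1}{H_m(\alpha)}\, f(y)\, r^{\alpha - m}\, dV = \frac{2^{1-m}}{H_m(\alpha)}\, f[\sigma (b + \tau c)]\, \sigma^{\alpha - 1}\, \tau^{\frac12 (\alpha - m)}\, (1 - \tau)^{m - 2}\, d\tau\, d\sigma\, dS_{m-2}.$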
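
The change of variables can be checked by a small numerical experiment. The sketch below assumes the inverse formulae ρ = ½σ(1 − τ), t = ½σ(1 + τ) and the relation r² = t² − ρ² = σ²τ; it compares the weight r^{α−m} ρ^{m−2} |∂(ρ, t)/∂(σ, τ)| with the closed form 2^{1−m} σ^{α−1} τ^{(α−m)/2} (1 − τ)^{m−2}. All function names are hypothetical.

```python
import math

def rho_t(sigma, tau):
    """Inverse formulae: (rho, t) as functions of (sigma, tau)."""
    return 0.5 * sigma * (1.0 - tau), 0.5 * sigma * (1.0 + tau)

def jacobian(sigma, tau, h=1e-6):
    """Central-difference estimate of d(rho, t) / d(sigma, tau)."""
    plus, minus = rho_t(sigma + h, tau), rho_t(sigma - h, tau)
    d_sigma = [(p - q) / (2.0 * h) for p, q in zip(plus, minus)]
    plus, minus = rho_t(sigma, tau + h), rho_t(sigma, tau - h)
    d_tau = [(p - q) / (2.0 * h) for p, q in zip(plus, minus)]
    return d_sigma[0] * d_tau[1] - d_sigma[1] * d_tau[0]

def weight_direct(sigma, tau, alpha, m):
    """r^(alpha - m) * rho^(m - 2) * Jacobian, computed from (rho, t)."""
    rho, t = rho_t(sigma, tau)
    r = math.sqrt(t * t - rho * rho)
    return r ** (alpha - m) * rho ** (m - 2) * 0.5 * sigma

def weight_closed_form(sigma, tau, alpha, m):
    """2^(1 - m) sigma^(alpha - 1) tau^((alpha - m) / 2) (1 - tau)^(m - 2)."""
    return (2.0 ** (1 - m) * sigma ** (alpha - 1.0)
            * tau ** (0.5 * (alpha - m)) * (1.0 - tau) ** (m - 2))
```

The two weights agree to rounding error for any 0 < τ < 1 and σ > 0, which is the content of the "simplifications" leading to the separated integrand.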


In order to get $I^\alpha f(O)$, we have to integrate this expression over the domain $D_S^O$. However, it will be convenient to divide this domain into two parts and treat these parts separately. First we choose $\gamma$ small enough, so that the double-cone $D_{\gamma a}^O$ should be contained in $D_S^O$. Then we divide the latter domain into $D_{\gamma a}^O$ and $D_S^O - D_{\gamma a}^O$ and show by rather different methods that the corresponding parts of the integral $I^\alpha$, denoted incidentally by $I_I^\alpha$ and $I_{II}^\alpha$, are holomorphic for $\alpha > -1$ and that $I_I^0 = f(O)$, $I_{II}^0 = 0$, which gives that the original $I^0 f(O) = f(O)$. By this our main objective will be attained. The more difficult part of the proof, the one concerning $I_I$, will be carried out in the present section. The easy part $I_{II}$ can be treated by a method similar to that used in §4 for a simple layer. Therefore it is postponed to §5. In the same section we apply our results concerning the analytic continuation of a simple and a double layer to carry out the "unlimited" analytic continuation of the volume potentials.

The integral of the right-hand side of (3.1) extended over the double-cone $D_{\gamma a}^O$ gives us the functional $I^\alpha f(O)$ relative to this special domain. A very great simplification arises here from the fact that the limits of integration with respect to $\tau$ and $\sigma$ are fixed, $\tau$ varying between 0 and 1 and $\sigma$ between 0 and $\gamma$. Thus we have in the present case


$I^\alpha f(O) = \int_{S_{m-2}} dS_{m-2} \int_0^\gamma d\sigma \int_0^1 \ldots\, d\tau,$


where the dots stand for the integrand given in the right-hand side of (3.1). Besides the formula (1.6) for $H_m(\alpha)$ we shall need the relations


(3.2) $\int_0^1 \tau^{r-1} (1 - \tau)^{s-1}\, d\tau = \frac{\Gamma(r)\, \Gamma(s)}{\Gamma(r + s)},$
(3.3) $\Gamma(r) = \pi^{-1/2}\, 2^{r-1}\, \Gamma(\tfrac12 r)\, \Gamma(\tfrac12 r + \tfrac12)$
and the explicit expression for the total surface $|S_{m-2}|$ of the sphere $S_{m-2}$,

(3.4) $|S_{m-2}| = \frac{2 \pi^{\frac12 (m-1)}}{\Gamma(\tfrac12 m - \tfrac12)}.$

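We develop $f[\sigma (b + \tau c)]$ in a finite power series in $\tau$ with a remainder term. We have

(3.5) $f(y) = f[\sigma (b + \tau c)] = \sum_{p=0}^{N-1} \frac{\tau^p}{p!}\, \Phi_p(\sigma, v) + R_N(\tau),$
where $\Phi_p(\sigma, v) = \left[ \frac{\partial^p f}{\partial \tau^p} \right]_{\tau = 0}$, and
(3.6) $R_N(\tau) = \frac{1}{(N-1)!} \int_0^\tau \frac{\partial^N}{\partial \theta^N} f[\sigma (b + \theta c)]\, (\tau - \theta)^{N-1}\, d\theta.$


N is here a sufficiently large integer, to be specified later. Obviously $R_N(\tau) = O(\tau^N)$.


We first compute the integral


$A(\alpha) = \frac{2^{1-m}}{H_m(\alpha)} \int_0^1 f[\sigma (b + \tau c)]\, \tau^{\frac12 (\alpha - m)} (1 - \tau)^{m-2}\, d\tau.$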


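This integral and all the integrals which follow are convergent for $\alpha > m - 2$. On account of (3.5) we have


(3.7) $A(\alpha) = \sum_{p=0}^{N-1} A_p(\alpha)\, \Phi_p(\sigma, v) + \frac{2^{1-m}}{H_m(\alpha)} \int_0^1 R_N(\tau)\, \tau^{\frac12 (\alpha - m)} (1 - \tau)^{m-2}\, d\tau,$


where by means of (3.2) with $r = \tfrac12 (\alpha + 2 - m) + p$, $s = m - 1$

(3.8) $A_p(\alpha) = \frac{2^{1-m}\, \Gamma(m - 1)\, \Gamma(\tfrac12 (\alpha + 2 - m) + p)}{p!\, H_m(\alpha)\, \Gamma(\tfrac12 (\alpha + m) + p)}.$

The most important term in (3.7) is $A_0(\alpha) \Phi_0(\sigma, v) = A_0(\alpha) f(\sigma b)$. In view of the expression (1.6) of $H_m(\alpha)$ we find

(3.9) $A_0(\alpha) = \frac{2^{1-m}\, \Gamma(m - 1)}{\pi^{\frac12 (m-2)}\, 2^{\alpha - 1}\, \Gamma(\tfrac12 \alpha)\, \Gamma(\tfrac12 (\alpha + m))}.$


Expressing $2^{\alpha - 1} \Gamma(\tfrac12 \alpha)$ by means of (3.3), with $r = \alpha$, we find after some simplifications

(3.10) $A_0(\alpha) = K_0(\alpha) \cdot \frac{1}{\Gamma(\alpha)}$, where $K_0(\alpha) = \frac{2^{1-m}\, \Gamma(m - 1)\, \Gamma(\tfrac12 \alpha + \tfrac12)}{\pi^{\frac12 (m-1)}\, \Gamma(\tfrac12 (\alpha + m))}.$

Since $\Gamma(\tfrac12) = \pi^{1/2}$, we have


(3.11) $K_0(0) = \frac{2^{1-m}\, \Gamma(m - 1)}{\pi^{\frac12 (m-2)}\, \Gamma(\tfrac12 m)}.$


According to (3.3), with $r = m - 1$, and to (3.4)


(3.12) $K_0(0) = \frac{\Gamma(\tfrac12 m - \tfrac12)}{2 \pi^{\frac12 (m-1)}} = \frac{1}{|S_{m-2}|}.$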
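
The Gamma-function identities behind this normalisation can be confirmed numerically. The sketch below assumes the readings H_m(α) = π^{(m−2)/2} 2^{α−1} Γ(½α) Γ(½(α + 2 − m)), |S_{m−2}| = 2π^{(m−1)/2} / Γ(½(m − 1)) and K_0(α) = 2^{1−m} Γ(m − 1) Γ(½α + ½) / (π^{(m−1)/2} Γ(½(α + m))); these are reconstructions of (1.6), (3.4) and (3.10), and the function names are hypothetical.

```python
import math

def sphere_area(m):
    """Total surface |S_{m-2}| of the unit sphere in (m - 1)-space."""
    return 2.0 * math.pi ** (0.5 * (m - 1)) / math.gamma(0.5 * (m - 1))

def k0(alpha, m):
    """K_0(alpha) as read from (3.10)."""
    return (2.0 ** (1 - m) * math.gamma(m - 1) * math.gamma(0.5 * alpha + 0.5)
            / (math.pi ** (0.5 * (m - 1)) * math.gamma(0.5 * (alpha + m))))

def a0(alpha, m):
    """A_0(alpha) as read from (3.9), after the factor Gamma(beta) cancels."""
    return (2.0 ** (1 - m) * math.gamma(m - 1)
            / (math.pi ** (0.5 * (m - 2)) * 2.0 ** (alpha - 1.0)
               * math.gamma(0.5 * alpha) * math.gamma(0.5 * (alpha + m))))
```

For every dimension m ≥ 2 the product k0(0, m) * sphere_area(m) should equal 1, and for α > 0 the quotient k0(alpha, m) / math.gamma(alpha) should coincide with a0(alpha, m), which is the duplication formula (3.3) in disguise.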


Our next step is to carry out the analytic continuation of the expression


(3.13) $A_0(\alpha) \int_0^\gamma \Phi_0(\sigma, v)\, \sigma^{\alpha - 1}\, d\sigma = K_0(\alpha) \cdot \frac{1}{\Gamma(\alpha)} \int_0^\gamma f(\sigma b)\, \sigma^{\alpha - 1}\, d\sigma.$


The integral converges for $\alpha > 0$ and, according to what we know about the one-dimensional case (cf. p. 39), the analytic continuation of (3.13) is holomorphic for $\alpha > -1$. (For $\alpha = -1$ the function $\Gamma(\tfrac12 (\alpha + 1))$ has a pole.) For $\alpha = 0$ the expression (3.13) becomes $K_0(0) f(O) = f(O) / |S_{m-2}|$. The integral of this constant value with respect to the angular variable is clearly f(O). Hence, and this contains virtually our main result, the term corresponding to p = 0 in the $I^\alpha$ relative to the double-cone $D_{\gamma a}^O$ yields exactly f(O) for $\alpha = 0$.

The terms in (3.7) with p > 0 are easy to handle. In analogy with (3.13) we have to consider the expression


(3.14) $A_p(\alpha) \int_0^\gamma \Phi_p(\sigma, v)\, \sigma^{\alpha - 1}\, d\sigma,\quad p \ge 1,$
where $A_p(\alpha)$ is given in (3.8). We first note that


$\Gamma(\tfrac12 (\alpha + 2 - m) + p) = \Gamma(\tfrac12 (\alpha + 2 - m))\, P_p(\alpha),$


where $P_p(\alpha)$ is a polynomial of degree p in $\alpha$. Hence, after the same simplifications as those used for $A_0(\alpha)$ we obtain

$A_p(\alpha) = K_p(\alpha) \cdot \frac{1}{\Gamma(\alpha)}$, where $K_p(\alpha) = \frac{1}{p!}\, \frac{2^{1-m}\, \Gamma(m - 1)\, \Gamma(\tfrac12 \alpha + \tfrac12)}{\pi^{\frac12 (m-1)}\, \Gamma(\tfrac12 (\alpha + m) + p)}\, P_p(\alpha).$

According to (2.10) $\partial^p / \partial \tau^p$ contains the factor $\sigma^p$, $p \ge 1$. Hence all the integrals of the type given in (3.14) converge if $\alpha > -1$. Moreover, $K_p(0)$ is finite, $1/\Gamma(0) = 0$, consequently all expressions (3.14) vanish for $\alpha = 0$.

Since $R_N = O(\tau^N \sigma^N)$, the remainder term can be treated in an analogous way. It is clear that for the present purposes N may be any integer such that $-\tfrac12 (1 + m) + N \ge -1$ or $2N \ge m - 1$.

Summing up, it is now shown that the integral $I^\alpha f(O)$ extended to the double cone can be analytically continued to all values $\alpha > -1$ and that for these values it is a holomorphic function of $\alpha$. Moreover $I^0 f(O) = f(O)$, that is $I^0$ is the identity operator in the case of the double cone.

We could have gone a bit farther and established the possibility of the analytic continuation down to arbitrary negative values of $\alpha$. However, one difficulty would have remained, the possible occurrence of (simple) poles at the negative odd integers, that is, the poles of $\Gamma(\tfrac12 (\alpha + 1))$. As a matter of fact, none of these poles actually occurs. Their disappearance must be the effect of the integration with respect to the angular variable, considered here in a very summary way. On p. 64 of my Acta paper I indicate how the holomorphic character of the unlimited analytic continuation can be established by an indirect method. This will be carried out here in §5.


4. Simple layer and double layer. We now pass to the simple layer


(4.1) $\frac{1}{H_m(\alpha)} \int_{S^O} g(y)\, r^{\alpha - m}\, dS$


considered in formula (1.7), where now the vertex coincides with the origin and r is written instead of $r_{xy}$. The integral converges for $\alpha > m - 2$ and has to be continued analytically for $\alpha \le m - 2$. The portion $S^O$ of the surface S can be parametrized by the variables $\tau$ and v in the following way. Through every point of S passes a unique ray issuing from the origin. On such a ray $\tau$ and v are constant, hence also the vectors b and c corresponding to v are constant, while $\sigma$ varies. If we write the point of intersection of the ray with the surface S in the form $y = \sigma_S (b + \tau c)$, the equations $\sigma = \sigma_S(\tau, v)$ or $y = \sigma_S(\tau, v)(b + \tau c)$ and the additional condition $0 \le \tau \le 1$ yield the required parametrization of $S^O$, because b and c depend only on v.

Since v is indeterminate on the axis $y = \sigma a$, where $\tau = 1$, we divide $S^O$ into two parts $S_1^O$ and $S_2^O$ in the following way. With an arbitrary $\delta$ such that $0 < \delta < 1$ the first part will be given by $0 \le \tau \le \delta$ and the second by $\delta < \tau \le 1$.

That part of the simple layer which relates to $S_2^O$ is an entire function of $\alpha$ vanishing for all even integers $\le 0$. Indeed the corresponding part of the integral in (4.1) never ceases to converge and $H_m(\alpha)$ has poles at these integers owing to the factor $\Gamma(\tfrac12 \alpha)$ (cf. formula (1.6)).

In order to treat that part of (4.1) which is taken over $S_1^O$, we have to express the surface element dS in a convenient way. The angular variable v on the sphere $S_{m-2}$ can be expressed by m − 2 local parameters $\varphi^1, \varphi^2, \ldots, \varphi^{m-2}$. Thus, according to (1.4), we can write in summary notations


$dS = G(\tau, v)\, d\tau \prod d\varphi^i,\quad dS_{m-2} = \Theta(v) \prod d\varphi^i,$


hence $dS = H(\tau, v)\, d\tau\, dS_{m-2}$.
We set $\tfrac12 (\alpha + 2 - m) = \beta$ and can then write according to (1.6)


(4.2) $H_m(\alpha) = L_m(\alpha)\, \Gamma(\beta)$ where $L_m(\alpha) = \pi^{\frac12 (m-2)}\, 2^{\alpha - 1}\, \Gamma(\tfrac12 \alpha).$


We also set $g(y) = g(\tau, v)$ and recall the relation $r^2 = \sigma^2 \tau$ given in (2.8). Then we write that part of (4.1) which corresponds to $S_1^O$ in the form


(4.3) $U(\alpha) = \frac{1}{L_m(\alpha)} \int_{S_{m-2}} dS_{m-2}\, \frac{1}{\Gamma(\beta)} \int_0^\delta g(\tau, v)\, H(\tau, v)\, [\sigma_S(\tau, v)]^{\alpha - m}\, \tau^{\beta - 1}\, d\tau.$


Here $H(\tau, v)$, $\sigma_S(\tau, v)$, $g(\tau, v)$ are well-behaved even in $\tau$ and v by virtue of our hypothesis concerning the surface S and the function g(y). Moreover $\sigma_S$ is bounded away from 0. Hence we can apply our statement concerning the extended one-dimensional case (cf. p. 40), which gives that $(1/\Gamma(\beta)) \int_0^\delta \ldots\, d\tau$ is a holomorphic function of $\beta$, hence also of $\alpha$, and this is then also true for $U(\alpha)$. Owing to the presence of the factor $\Gamma(\tfrac12 \alpha)$ in $L_m(\alpha)$, the function $U(\alpha)$ vanishes for $\alpha = 0$ and $\alpha$ = a negative even integer.

There is very little to change in the case of a double layer. It is easily seen that $dr/dn = r^{-1} (y, n)$, hence


$\frac{d\, r^{\alpha - m}}{dn} = (\alpha - m)\, r^{\alpha - m - 1}\, \frac{dr}{dn} = (\alpha - m)\, r^{\alpha - m - 2}\, (y, n).$


The scalar product (y, n) (cf. (1.3)) is a well behaved function in $\tau$ and v. Owing to the lowered exponent the double layer integral in (1.7) converges only for $\alpha > m$, thus the need of continuation begins already at m.

The part relative to $S_2^O$ is again an entire function of $\alpha$ and vanishes for all even integers $\le 0$. On the other hand, with $\beta' = \tfrac12 (\alpha - m) = \beta - 1$,


$\frac{\alpha - m}{\Gamma(\beta)} = \frac{2(\beta - 1)}{\Gamma(\beta)} = \frac{2}{\Gamma(\beta - 1)} = \frac{2}{\Gamma(\beta')}.$


Thus, when treating the part relative to $S_1^O$, we obtain a formula of the same type as (4.3), $\beta' = \beta - 1$ playing for the double layer the same role as $\beta$ played for the simple layer, and the results are essentially the same.

Our findings can be summed up as follows. Both the simple layer and the double layer potentials can be continued analytically to arbitrary values of $\alpha$. They are holomorphic functions of $\alpha$ which vanish for all even integers $\le 0$.


5. The volume potential (continued). Now we return to the volume 
potential and clarify the properties of the part which relates to the domain 
Ds° — D,,° (cf. the beginning of §3). We want to prove that this part of J, 
when continued analytically, is holomorphic for a > —1, and vanishes for 
a = 0. 

In the same way as we did in the previous section with the portion of surface 
S°, we now divide the domain Ds° — D,,° into two parts according as 
$0 \le \tau \le \varepsilon$ or $\varepsilon < \tau \le 1$. The volume potential relative to the second part is again an entire function vanishing for all even integers $\le 0$. Thus we only have to investigate the first part. This can be written in the form (cf. (3.1))

gis 


; a7 
Hn(a) «asus J nem — 1) "dr [nee + rc)]o*~*de, 


where ¢s, defined in the previous section, depends on +r and v, og = og(r, 0). 
We set 


Jie + rc)]o" "do = F(r, 0,0), 


where $F$ is well behaved in $\tau$ (and $v$) and holomorphic in $\alpha$, since $\sigma$ is bounded away from 0. Writing, as in the case of a simple layer, $H_m(\alpha) = L_m(\alpha)\,\Gamma(\beta)$ we see by virtue of our findings in the extended one-dimensional case that


a 3 : B—1 m—2 
L.(a) F(B) f F(r,v,a)r" (1 — r)™ dr 


can be continued analytically as a holomorphic function of $\alpha$ to any $\alpha > \alpha_0$, where $\alpha_0$ is arbitrary. Thus it is a holomorphic function for $\alpha > -1$ anyway, and again, owing to the presence of the factor $\Gamma(\alpha/2)$ in $L_m(\alpha)$, it vanishes for $\alpha = 0$.

This completes the proof of the fact stated in §3, that $I^0 f(O) = f(O)$, if $I^0$ is relative to the original domain $D_S^O$. This can clearly be expressed by the more inspiring formula

$$I^0 f(x) = f(x), \tag{5.1}$$







or by the statement that $I^0$ is the identity operator. This is our principal result.
The passage from (5.1) to the relation 


(5.2) Isf, g, h (x) = f(x) 


follows from the properties of simple and double layers established in the 
previous section. 

We conclude this paper by two additional remarks, the first concerning the 
unlimited analytic continuation of the volume potential, the second concerning 
the case of the infinite cone. 

As indicated on p. 64 of our Acta paper, the unlimited holomorphic con- 
tinuation of the volume potential can be reduced to that of simple layer and 
double layer potentials. 

We consider the wave operator

$$\Delta = \frac{\partial^2}{(\partial x^1)^2} - \frac{\partial^2}{(\partial x^2)^2} - \cdots - \frac{\partial^2}{(\partial x^m)^2}.$$

Then, by virtue of Green's formula, we have in the notation (1.7) for $\alpha > m - 2$


(5.3) I*f(x) = ra, Z f(x) 
dn 

(see Acta paper, pp. 46-47). The left-hand side converges for $\alpha > m - 2$. On the right-hand side the volume potential and the simple layer converge for $\alpha + 2 > m - 2$, that is for $\alpha > m - 4$, while the double layer converges only for $\alpha + 2 > m$, that is only for $\alpha > m - 2$. Thus, seemingly nothing is gained as far as the analytic continuation of $I^{\alpha}f(x)$ is concerned. But if we take into account our results about the unlimited holomorphic continuation of the simple and the double layer and the fact that $I^{\alpha+2}\Delta f$ is holomorphic for $\alpha > m - 4$, the possibility of the holomorphic continuation of $I^{\alpha}f(x)$ down to $m - 4$ is established. The iteration of this procedure, that is the application of formula (5.3) to $I^{\alpha+2}\Delta f$, $I^{\alpha+4}\Delta^2 f, \ldots$, establishes the possibility of an unlimited holomorphic continuation of $I^{\alpha}f(x)$.

In the case of an infinite cone we suppose that f(x) and its derivatives are not 
only well behaved, but also decrease rapidly enough at infinity. In this case 
formula (5.3) reduces to 


$$I^{\alpha}f(x) = I^{\alpha+2}\Delta f(x), \tag{5.4}$$


and the iteration of this formula gives immediately the possibility of an 
unlimited holomorphic continuation. However, in order to establish the main 
relation I°f(x) = f(x), we still have to go back to §3 and use the double-cone 
D,,°. The treatment of the complementary expression J,;* is in the infinite 
case still simpler than in the finite case. 


University of Lund 
University of Maryland 











CONSTRUCTION OF PRIMITIVES OF GENERALIZED 
DERIVATIVES WITH APPLICATIONS TO 
TRIGONOMETRIC SERIES 


P. S. BULLEN 


1. Introduction. This paper is an extension of the ideas discussed in 
(3, §§ 14-16); the extension consisting of the use of the third and fourth sym- 
metric Riemann derivative instead of the Schwarz or second symmetric 
Riemann derivative. 

The $J_2$-integral, due to James (1), is defined in (3) as follows. Let $f(x)$ be measurable on $[a, b]$ and finite at each point; if there exists a continuous function $F(x)$ such that $D^2F = f$ everywhere on $(a, b)$, where

$$D^2F = \lim_{h \to 0} \frac{F(x + h) - 2F(x) + F(x - h)}{h^2},$$

then

$$\int_{a,b}^{x} f(t)\,d_2t = F(x) - \frac{x - b}{a - b}\,F(a) - \frac{x - a}{b - a}\,F(b) = H_2(F: a, b, x). \tag{1.1}$$


The definition is unique since if $F(x)$ and $G(x)$ are continuous and $D^2F = D^2G$ everywhere then

$$H_2(F: a, b, x) = H_2(G: a, b, x).$$

This integral has application to convergent trigonometric series, (3).

Using the third and fourth symmetric Riemann derivatives, $J_3$- and $J_4$-integrals are defined and applied to $(C, 1)$ and $(C, 2)$ summable trigonometric series.


2. Definitions. With the notation of Kassimatis, (4), we write for any function $F(x)$ defined at the points $x_1, x_2, x_3, x_4$,

$$H_3(F: x_1, x_2, x_3, x_4) = F(x_4) - F(x_3)\,\frac{(x_4 - x_1)(x_4 - x_2)}{(x_3 - x_1)(x_3 - x_2)} - F(x_2)\,\frac{(x_4 - x_3)(x_4 - x_1)}{(x_2 - x_3)(x_2 - x_1)} - F(x_1)\,\frac{(x_4 - x_2)(x_4 - x_3)}{(x_1 - x_2)(x_1 - x_3)}, \tag{2.1}$$

$$V_3(F: x_1, x_2, x_3, x_4) = \frac{H_3(F: x_1, x_2, x_3, x_4)}{(x_4 - x_1)(x_4 - x_2)(x_4 - x_3)}. \tag{2.2}$$
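
As an informal illustration of (2.1) and (2.2), the sketch below evaluates $H_3$ and $V_3$ in exact rational arithmetic. It assumes the reading of (2.1) in which $H_3$ is $F(x_4)$ minus the value at $x_4$ of the quadratic interpolating $F$ at $x_1, x_2, x_3$; the test polynomials and sample points are hypothetical.

```python
from fractions import Fraction

def H3(F, x1, x2, x3, x4):
    """H_3(F: x1, x2, x3, x4): F(x4) minus the quadratic through
    (x1, F(x1)), (x2, F(x2)), (x3, F(x3)) evaluated at x4."""
    return (F(x4)
            - F(x3) * (x4 - x1) * (x4 - x2) / ((x3 - x1) * (x3 - x2))
            - F(x2) * (x4 - x3) * (x4 - x1) / ((x2 - x3) * (x2 - x1))
            - F(x1) * (x4 - x2) * (x4 - x3) / ((x1 - x2) * (x1 - x3)))

def V3(F, x1, x2, x3, x4):
    """Third divided difference of F at the four points."""
    return H3(F, x1, x2, x3, x4) / ((x4 - x1) * (x4 - x2) * (x4 - x3))

def cubic(x):
    # Hypothetical test function with leading coefficient 5.
    return 5 * x**3 - 2 * x**2 + 7 * x - 1

def quadratic(x):
    # Hypothetical test function of degree 2.
    return 4 * x**2 - x + 3

points = [Fraction(-1), Fraction(1, 2), Fraction(2), Fraction(7, 3)]
print(V3(cubic, *points))      # leading coefficient of the cubic
print(H3(quadratic, *points))  # vanishes for polynomials of degree <= 2
```

The second check mirrors the step used later in the proof of Lemma 3.2, where a function with vanishing third divided differences is a polynomial of degree at most 2.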


Received November 6, 1959. This paper was written while the author was a fellow at the 
Summer Research Institute of the Canadian Mathematical Congress. 


$V_3$ is then the third divided difference of $F(x)$. In particular if $h > k > 0$ we write

$$\omega_3(F; x; h, k) = \omega_3(x; h, k) = 3!\,V_3(F: x + h, x + k, x - k, x - h) = \frac{3}{h^2 - k^2}\left\{\frac{F(x + h) - F(x - h)}{h} - \frac{F(x + k) - F(x - k)}{k}\right\}, \tag{2.3}$$

$$\omega_3(F; x; 3h, h) = \frac{\Delta^3 F(x; 2h)}{(2h)^3} = \frac{F(x + 3h) - 3F(x + h) + 3F(x - h) - F(x - 3h)}{(2h)^3}. \tag{2.4}$$
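
The closed forms (2.3) and (2.4) can be checked against the general divided-difference formula. The sketch below is an illustration only: `divided_difference`, `omega3` and the test function `G` are names introduced here, and the last line assumes a smooth function (the sine) to show that $\omega_3(x; 3h, h)$ approximates the ordinary third derivative for small $h$.

```python
from fractions import Fraction
import math

def divided_difference(F, points):
    """Divided difference: sum of F(x_i) / prod_{j != i} (x_i - x_j)."""
    total = 0
    for i, xi in enumerate(points):
        denom = 1
        for j, xj in enumerate(points):
            if j != i:
                denom *= (xi - xj)
        total += F(xi) / denom
    return total

def omega3(F, x, h, k):
    """Right-hand side of (2.3)."""
    return 3 / (h * h - k * k) * ((F(x + h) - F(x - h)) / h
                                  - (F(x + k) - F(x - k)) / k)

def third_difference_quotient(F, x, h):
    """Right-hand side of (2.4)."""
    return (F(x + 3 * h) - 3 * F(x + h) + 3 * F(x - h) - F(x - 3 * h)) / (2 * h) ** 3

def G(x):
    # Hypothetical non-cubic test polynomial, so the checks are not trivial.
    return x**5 - 2 * x**4 + x

x, h, k = Fraction(1, 3), Fraction(1, 2), Fraction(1, 5)
print(omega3(G, x, h, k) == 6 * divided_difference(G, [x + h, x + k, x - k, x - h]))
print(omega3(G, x, 3 * h, h) == third_difference_quotient(G, x, h))
print(omega3(math.sin, 0.3, 0.03, 0.01), -math.cos(0.3))
```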
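From (2.3) and (2.4) we define

$$\Delta^{(3)}F(x) = \limsup_{h,k \to 0} \omega_3(x; h, k), \qquad \delta^{(3)}F(x) = \liminf_{h,k \to 0} \omega_3(x; h, k), \tag{2.5}$$

$$\overline{D}^3F(x) = \limsup_{h \to 0} \omega_3(x; 3h, h), \qquad \underline{D}^3F(x) = \liminf_{h \to 0} \omega_3(x; 3h, h), \tag{2.6}$$

and if $\overline{D}^3F(x) = \underline{D}^3F(x)$ we say that $F(x)$ has a third symmetric Riemann derivative at $x$ and write it $D^3F(x)$.

Clearly

$$\delta^{(3)}F \le \underline{D}^3F \le \overline{D}^3F \le \Delta^{(3)}F. \tag{2.7}$$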

The following lemma, which generalizes Theorem 19, (3), is needed later.

LEMMA 2.1. If $F''$ exists in an interval containing $x$ and if $\Delta_1$ ($\delta_1$) is the greater (smaller) of the first derivates of $F''$ then

$$\delta_1 \le \delta^{(3)} \le \Delta^{(3)} \le \Delta_1. \tag{2.8}$$

All points will be assumed to be interior to the interval mentioned in the statement of the lemma. It is sufficient to prove $\delta_1 \le \delta^{(3)}$ as a similar argument will complete (2.8). Further we may obviously assume $\delta_1 > -\infty$. The proof is in two parts.

(a) Assume $\delta_1 < \infty$. From the definition of $\delta_1$, if $\epsilon > 0$ is given, there exists $\mu > 0$ such that if $0 < \eta, \xi < \mu$ then

$$F''(x + \eta) - F''(x) > \eta(\delta_1 - \epsilon),$$
$$F''(x - \xi) - F''(x) < -\xi(\delta_1 - \epsilon).$$

Consider the function $X(u)$ defined by

$$X(u) = F(x + u) - F(x) - uF'(x) - \frac{u^2}{2}F''(x) - \frac{u^3}{3!}(\delta_1 - \epsilon).$$

The following properties of $X(u)$ are immediate,

$$X'(u) = F'(x + u) - F'(x) - uF''(x) - \frac{u^2}{2}(\delta_1 - \epsilon),$$
$$X''(u) = F''(x + u) - F''(x) - u(\delta_1 - \epsilon),$$
$$X(0) = X'(0) = X''(0) = 0,$$
















$$X''(u) > 0 \ \text{if}\ 0 < u < \mu, \qquad X''(u) < 0 \ \text{if}\ -\mu < u < 0, \tag{2.9}$$

$$\omega_3(X; 0; h, k) = \omega_3(F; x; h, k) - (\delta_1 - \epsilon). \tag{2.10}$$

It follows, from (2.10), that it is sufficient to show that, for all $h$, $k$ small enough, $\omega_3(X; 0; h, k) > 0$. To do this define, for $0 < u < \mu$,

$$Y(u) = \frac{X(u) - X(-u)}{u}.$$

Then by (2.3)

$$\omega_3(X; 0; h, k) = \frac{3}{h^2 - k^2}\,\bigl(Y(h) - Y(k)\bigr).$$

If, therefore, $Y(u)$ is monotonic increasing for all $u$ small enough, then $\omega_3(X; 0; h, k) > 0$. Now

$$Y'(u) = -\frac{1}{u^2}\Bigl[\{X(u) - uX'(u)\} - \{X(-u) - (-u)X'(-u)\}\Bigr] = -\frac{1}{u^2}\bigl[Z(u) - Z(-u)\bigr]$$

where $Z(u) = X(u) - uX'(u)$. $Z(u)$ is clearly defined wherever $X(u)$ and $X'(u)$ are defined and

$$Z'(u) = -uX''(u) < 0$$

by (2.9). Hence $Z$ is decreasing, so that $Z(-u) > Z(u)$ for $u > 0$, and hence $Y'(u) > 0$ wherever $Y'(u)$ is defined.

Thus we have shown that $Y(u)$ is monotonic increasing and the result follows.

(b) Assume $\delta_1 = \infty$. Then in the above argument replace $\delta_1 - \epsilon$ by an arbitrary positive number $A$ to arrive at $\delta^{(3)} = \delta_1 = \infty$.

The following lemma due to Saks, (5), will be required later.

LEMMA 2.2. If $F'(x)$ exists everywhere in $[a, b]$ and $\underline{D}^3F \ge 0$ in $(a, b)$ then $F'(x)$ is continuous, convex, and

$$\Delta^3F(x; 2h) \ge 0 \tag{2.11}$$

for every $x$, $h > 0$ ($a \le x - 3h < x + 3h \le b$).


3. We now wish to define a class of functions for which $D^3F = D^3G$ everywhere in an interval implies $H_3(F: x_1, x_2, x_3, x_4) = H_3(G: x_1, x_2, x_3, x_4)$ for all sets of four points in that interval. As has been pointed out by Kassimatis, (4), continuity of $F$ and $G$ is not enough.

LEMMA 3.1. If $F'(x)$ exists and is continuous in $[a, b]$ then

$$\min_{a < x < b} \underline{D}^3F(x) \le 3!\,V_3(F: x_1, x_2, x_3, x_4) \le \max_{a < x < b} \overline{D}^3F(x) \tag{3.1}$$

for all $x_1, x_2, x_3, x_4$ in $[a, b]$.




The argument is that of Verblunsky, (6). Define

$$f(x) = F(x) - (ax^3 + bx^2 + cx + d) \tag{3.2}$$

where $a, b, c, d$ are determined by the conditions

$$f(x_1) = f(x_2) = f(x_3) = f(x_4) = 0$$

for some $x_1, x_2, x_3, x_4$ in $(a, b)$. Simple calculations then show that

$$a = V_3(F; x_1, x_2, x_3, x_4).$$

Since $f(x)$ has four zeros, $f'(x)$ has three zeros and a maximum at some point $\xi$, say. Then we must have

$$\omega_3(f; \xi; 3h_i, h_i) \le 0 \tag{3.3}$$

for a sequence of $h_i$, $h_i \to 0$. For if not, then for all $h$ small enough

$$\frac{\Delta^3 f(\xi; 2h)}{(2h)^3} > 0,$$

that is,

$$\frac{f(\xi + 3h) - f(\xi - 3h)}{6h} > \frac{f(\xi + h) - f(\xi - h)}{2h},$$

which contradicts the fact that $f'(x)$ has a maximum at $\xi$. As we have (3.3) it follows that

$$\underline{D}^3 f(\xi) \le 0,$$

that is,

$$\underline{D}^3F(\xi) \le 3!\,a = 3!\,V_3(F: x_1, x_2, x_3, x_4).$$

This proves the left-hand inequality of (3.1). The right-hand inequality comes from applying the above result to $-F$. Finally the result holds for $x_1, x_2, x_3, x_4$ in $[a, b]$ by continuity of $F(x)$.

An immediate corollary of Lemma 3.1 is 


LEMMA 3.2. If the relation (3.1) holds for $F(x) - G(x)$, in particular if $(F - G)'$ is continuous, then $D^3(F - G) = 0$ implies

$$H_3(F: x_1, x_2, x_3, x_4) = H_3(G: x_1, x_2, x_3, x_4) \tag{3.4}$$

for all $x_1, x_2, x_3, x_4$ in $[a, b]$.

Let

$$F_1(x) = H_3(F: x_1, x_2, x_3, x),$$
$$G_1(x) = H_3(G: x_1, x_2, x_3, x).$$

Then $D^3(F_1 - G_1) = 0$ and hence, by (3.1),

$$V_3(F_1 - G_1: y_1, y_2, y_3, y_4) = 0$$










for all $y_1, y_2, y_3, y_4$. $F_1(x) - G_1(x)$ is therefore a polynomial of degree at most 2 but it is zero at $x_1, x_2, x_3$ and hence is identically zero.

The following lemma, due to Kassimatis, (4), obtains (3.4) under weaker conditions than the continuity of $(F - G)'$ but it is quite possible that (3.1) holds under less restrictive conditions which would then generalize Lemma 3.2.

LEMMA 3.3. If $F(x)$ and $G(x)$ are defined in $[a, b]$ and (i) $F - G$ is continuous in $[a, b]$, (ii) $(F - G)'$ exists in $(a, b)$ then $D^3(F - G) = 0$ implies (3.4).


4. The $J_3$-integral. Let $f(x)$ be defined and measurable on $[a, b]$ and finite at each point. If there exists a function $F(x)$, continuous on $[a, b]$ and differentiable on $(a, b)$ such that $D^3F = f$ then we define the $J_3$-integral of $f$ to be

$$\int_{x_1, x_2, x_3}^{x} f(t)\,d_3t = H_3(F: x_1, x_2, x_3, x) \tag{4.1}$$

where $x_1, x_2, x_3, x$ are any four points of $[a, b]$ and $a \le x_1 < x_2 < x_3 \le b$. Lemma 3.3 ensures that this definition is unique.

If $f(x)$ is complex valued and $f(x) = u(x) + iv(x)$ then we define the $J_3$-integral of $f(x)$, if it is defined for both $u(x)$ and $v(x)$, by

$$\int_{x_1, x_2, x_3}^{x} f(t)\,d_3t = \int_{x_1, x_2, x_3}^{x} u(t)\,d_3t + i \int_{x_1, x_2, x_3}^{x} v(t)\,d_3t.$$


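The following elementary properties of this integral are immediate.

(a) If $f(x)$ and $g(x)$ are $J_3$-integrable on $[a, b]$ so is $\alpha f(x) + \beta g(x)$ for any numbers $\alpha$, $\beta$ and

$$\int_{x_1, x_2, x_3}^{x} \{\alpha f(t) + \beta g(t)\}\,d_3t = \alpha \int_{x_1, x_2, x_3}^{x} f(t)\,d_3t + \beta \int_{x_1, x_2, x_3}^{x} g(t)\,d_3t.$$

(b) If $f(x)$ is $J_3$-integrable on $[a, b]$ it is also $J_3$-integrable on any subinterval $[\alpha, \beta]$ and if

$$F(x) = \int_{x_1, x_2, x_3}^{x} f(t)\,d_3t$$

then if $\alpha < \delta < \beta$ and $\alpha \le x \le \beta$,

$$\int_{\alpha, \delta, \beta}^{x} f(t)\,d_3t = H_3(F: \alpha, \delta, \beta, x).$$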


5. Application to trigonometric series. 


THEOREM 1. Let $f(t)$ be the $(C, 1)$ sum of the series

$$\sum_{n=-\infty}^{\infty} c_n e^{int}, \qquad c_0 = 0,$$

and let

$$F(t) = \sum_{n \ne 0} \frac{c_n e^{int}}{(in)^3}; \tag{5.1}$$

then

$$\int_{x_1, x_2, x_3}^{x} f(t)\,d_3t = H_3(F: x_1, x_2, x_3, x). \tag{5.2}$$


= 


To obtain this result we need the following lemma (8, II, p. 69).

LEMMA 5.1. If

$$\sum_{n=-\infty}^{\infty} c_n e^{inx}, \qquad c_0 = 0,$$

is summable $(C, \alpha)$, $\alpha > -1$, at $x_0$ to $s$ then it is summable $R_r$ at $x_0$ to $s$ provided $r > 1 + \alpha$. By this we mean that

$$\lim_{h \to 0} \sum_{n=-\infty}^{\infty} c_n e^{inx_0} \left(\frac{\sin nh}{nh}\right)^{r} = s.$$

The result as stated in (8) requires $\alpha > -1$ but it is in fact true when $\alpha = -1$, when it is the result of (8, I, p. 322).
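Simple calculations give

$$\frac{F(x + h) - F(x - h)}{2h} = \sum_{n \ne 0} \frac{c_n e^{inx}}{(in)^2} \left(\frac{\sin nh}{nh}\right),$$

$$\frac{F(x + 2h) - 2F(x) + F(x - 2h)}{(2h)^2} = \sum_{n \ne 0} \frac{c_n e^{inx}}{in} \left(\frac{\sin nh}{nh}\right)^{2},$$

$$\frac{F(x + 3h) - 3F(x + h) + 3F(x - h) - F(x - 3h)}{(2h)^3} = \sum_{n \ne 0} c_n e^{inx} \left(\frac{\sin nh}{nh}\right)^{3}.$$

By hypothesis the series $\sum c_n e^{int}$ is summable $(C, 1)$ and hence the series

$$\sum \frac{c_n e^{int}}{in} \quad\text{and}\quad \sum \frac{c_n e^{int}}{(in)^2}$$

are summable $(C, 0)$ and $(C, -1)$ respectively.

Hence by Lemma 5.1 $D^3F = f$ everywhere and $D^2F$ and $DF$ exist and this implies the existence of $F'(x)$, (4). This then proves that $f$ is $J_3$-integrable and gives (5.2).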
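
The three "simple calculations" can be confirmed numerically for a trigonometric polynomial. The sketch below is an illustration under stated assumptions: the coefficient table `COEFFS` is hypothetical, the sums are finite, and the second identity is taken with numerator $F(x + 2h) - 2F(x) + F(x - 2h)$, which is what the expansion of $(e^{inh} - e^{-inh})^2$ gives.

```python
import cmath
import math

# Hypothetical finite set of coefficients c_n (n != 0).
COEFFS = {-3: 0.5 - 1.0j, -1: 2.0 + 0.0j, 1: 2.0 + 0.0j, 2: -1.0 + 0.25j}

def F(x):
    """F(x) = sum_n c_n e^{inx} / (in)^3, as in (5.1)."""
    return sum(c * cmath.exp(1j * n * x) / (1j * n) ** 3 for n, c in COEFFS.items())

def weighted_series(x, h, r):
    """sum_n c_n e^{inx} / (in)^(3 - r) * (sin(nh) / (nh))^r for r = 1, 2, 3."""
    return sum(c * cmath.exp(1j * n * x) / (1j * n) ** (3 - r)
               * (math.sin(n * h) / (n * h)) ** r
               for n, c in COEFFS.items())

def difference_quotient(x, h, r):
    """Left-hand sides of the three identities, indexed by r."""
    if r == 1:
        return (F(x + h) - F(x - h)) / (2 * h)
    if r == 2:
        return (F(x + 2 * h) - 2 * F(x) + F(x - 2 * h)) / (2 * h) ** 2
    if r == 3:
        return (F(x + 3 * h) - 3 * F(x + h) + 3 * F(x - h) - F(x - 3 * h)) / (2 * h) ** 3
    raise ValueError("r must be 1, 2 or 3")

for r in (1, 2, 3):
    print(r, abs(difference_quotient(0.7, 0.1, r) - weighted_series(0.7, 0.1, r)))
```

Each printed discrepancy should be at rounding level; for $r = 3$ the right-hand side is exactly the $R_3$ mean to which Lemma 5.1 is applied.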


THEOREM 2. If

$$\sum_{n=-\infty}^{\infty} c_n e^{int}$$

is summable $(C, 1)$ to $f(t)$ then

$$c_n = -\frac{3}{8\pi^3} \int_{-4\pi, -2\pi, 2\pi}^{0} f(t)\,e^{-int}\,d_3t. \tag{5.3}$$
We first calculate $c_0$. In Theorem 1 we assumed for simplicity that $c_0 = 0$ but this clearly involves no loss in generality.

Hence we know that

$$\int_{-4\pi, -2\pi, 2\pi}^{0} f(t)\,d_3t = H_3(F: -4\pi, -2\pi, 2\pi, 0) = F(0) + \tfrac13 F(-4\pi) - F(-2\pi) - \tfrac13 F(2\pi),$$

where

$$F(x) = \frac{c_0 x^3}{3!} + \sum_{n \ne 0} \frac{c_n e^{inx}}{(in)^3}.$$

Since the last term on the right-hand side is periodic its contribution to the integral is zero. Therefore,

$$\int_{-4\pi, -2\pi, 2\pi}^{0} f(t)\,d_3t = \frac13 \cdot \frac{c_0}{6}(-4\pi)^3 - \frac{c_0}{6}(-2\pi)^3 - \frac13 \cdot \frac{c_0}{6}(2\pi)^3 = -\frac{8\pi^3}{3}\,c_0.$$

To calculate $c_n$, $n \ne 0$, requires $f(x)e^{-inx}$ to be expressed as the $(C, 1)$ sum of a trigonometric series with constant term $c_n$. This has been done by James, (2), and then a similar calculation to the one above completes the proof of (5.3).
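
The constant $-3/(8\pi^3)$ in (5.3) can be checked by evaluating $H_3$ at the four points $-4\pi, -2\pi, 2\pi, 0$. In the sketch below the value `C0` and the periodic part of `F` are hypothetical stand-ins for $c_0$ and for the series $\sum c_n e^{inx}/(in)^3$; only the $2\pi$-periodicity of that part matters.

```python
import math

def H3(F, x1, x2, x3, x4):
    """H_3(F: x1, x2, x3, x4) as defined in (2.1)."""
    return (F(x4)
            - F(x3) * (x4 - x1) * (x4 - x2) / ((x3 - x1) * (x3 - x2))
            - F(x2) * (x4 - x3) * (x4 - x1) / ((x2 - x3) * (x2 - x1))
            - F(x1) * (x4 - x2) * (x4 - x3) / ((x1 - x2) * (x1 - x3)))

C0 = 1.5  # hypothetical constant term of the series

def F(x):
    # c_0 x^3 / 3! plus an arbitrary 2*pi-periodic part (assumed for illustration).
    return C0 * x**3 / 6 + math.sin(x) - 0.3 * math.cos(2 * x)

def recovered_c0():
    """Apply formula (5.3) with n = 0 to the primitive F."""
    p = math.pi
    integral = H3(F, -4 * p, -2 * p, 2 * p, 0.0)
    return -3.0 / (8.0 * p**3) * integral

print(recovered_c0(), C0)
```

The periodic part drops out because its four weights $1, \tfrac13, -1, -\tfrac13$ sum to zero, which is the step "its contribution to the integral is zero" above.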


6. Construction of the $J_3$-integral. The $J_3$-integral can be constructed by methods used in Jeffery, (3), to construct the $J_2$-integral.

THEOREM 3. Let $f(x) \in L(a, b)$ and let $f(x)$ be the finite third symmetric Riemann derivative of a function continuous in $[a, b]$ and differentiable in $(a, b)$. Further let

$$\Phi(x) = \int_a^x \int_a^u \int_a^v f(t)\,dt\,dv\,du, \tag{6.1}$$

then

$$\int_{x_1, x_2, x_3}^{x} f(t)\,d_3t = H_3(\Phi: x_1, x_2, x_3, x). \tag{6.2}$$

Since the construction follows the lines of (3) it is only sketched here to point out certain differences.

We first determine a sequence of continuous functions $U_n(x)$ such that $\underline{D}^3U_n(x) \ge f(x)$ and which converges uniformly to $\Phi(x)$.

As in (1) define $A_n(x)$ such that, with the notation of Lemma 2.1, $\delta_1 A_n(x) \ge f(x)$ and $A_n(x)$ converges uniformly to $\int_a^x f(t)\,dt$ as $n \to \infty$.

Then the required $U_n(x)$ is

$$U_n(x) = \int_a^x \int_a^u A_n(t)\,dt\,du$$

which clearly converges uniformly to $\Phi(x)$ and, from the continuity of $A_n(x)$,

$$U_n'(x) = \int_a^x A_n(t)\,dt, \qquad U_n''(x) = A_n(x).$$




By Lemma 2.1

$$\delta^{(3)}U_n(x) \ge \delta_1 U_n''(x) = \delta_1 A_n(x) \ge f(x) = D^3F(x)$$

where $F(x)$ is some function continuous on $[a, b]$ and differentiable on $(a, b)$. Hence

$$\underline{D}^3\bigl(U_n(x) - F(x)\bigr) \ge 0$$

and so, by Lemma 2.2,

$$\Delta^3(U_n - F)(x; 2h) \ge 0$$

for all $x$, $h > 0$, $a \le x - 3h < x + 3h \le b$. Hence letting $n \to \infty$

$$\Delta^3(\Phi - F)(x; 2h) \ge 0$$

which implies

$$\underline{D}^3(\Phi - F) \ge 0.$$

In a similar manner it can be shown that

$$\overline{D}^3(\Phi - F) \le 0$$

which together with the previous inequality implies

$$D^3(\Phi - F) = 0.$$

From Lemma 3.2 this gives

$$H_3(F: x_1, x_2, x_3, x_4) = H_3(\Phi: x_1, x_2, x_3, x_4),$$

completing the proof of the theorem.

A function is said to be Lebesgue integrable at a point $x_0$ if it is Lebesgue integrable in every sufficiently small neighbourhood of $x_0$.

As in (3) the above result can be extended to functions $f(x)$ which have a finite number of points at which they are not Lebesgue integrable. This can be done provided only that if $\beta$ is such a point, then

$$\lambda(x) = \int\!\!\int f(t)\,dt\,du$$

is Denjoy integrable in some interval $(\alpha, \gamma)$ containing $\beta$.


7. The fourth symmetric derivative. We now indicate the definitions and results in the case of the fourth symmetric Riemann derivative. As in § 2 we define

$$H_4(F: x_1, x_2, x_3, x_4, x_5) = F(x_5) - F(x_4)\,\frac{(x_5 - x_1)(x_5 - x_2)(x_5 - x_3)}{(x_4 - x_1)(x_4 - x_2)(x_4 - x_3)} - F(x_3)\,\frac{(x_5 - x_4)(x_5 - x_1)(x_5 - x_2)}{(x_3 - x_4)(x_3 - x_1)(x_3 - x_2)} - F(x_2)\,\frac{(x_5 - x_3)(x_5 - x_4)(x_5 - x_1)}{(x_2 - x_3)(x_2 - x_4)(x_2 - x_1)} - F(x_1)\,\frac{(x_5 - x_2)(x_5 - x_3)(x_5 - x_4)}{(x_1 - x_2)(x_1 - x_3)(x_1 - x_4)}, \tag{7.1}$$

$$V_4(F: x_1, x_2, x_3, x_4, x_5) = \frac{H_4(F: x_1, x_2, x_3, x_4, x_5)}{(x_5 - x_1)(x_5 - x_2)(x_5 - x_3)(x_5 - x_4)}. \tag{7.2}$$

It may be noted in passing that the function $f(x)$ in (3.2) is equal to $H_4(F: x_1, x_2, x_3, x_4, x)$.
In particular if $h > k > 0$ we write

$$\omega_4(F; x; h, k) = 4!\,V_4(F: x + h, x + k, x, x - k, x - h) = \frac{12}{h^2 - k^2}\left\{\frac{F(x + h) - 2F(x) + F(x - h)}{h^2} - \frac{F(x + k) - 2F(x) + F(x - k)}{k^2}\right\}, \tag{7.3}$$

$$\frac{\Delta^4F(x; h)}{h^4} = \omega_4(F; x; 2h, h) = \frac{F(x + 2h) - 4F(x + h) + 6F(x) - 4F(x - h) + F(x - 2h)}{h^4}. \tag{7.4}$$
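
As with the third-order case, (7.3) and (7.4) can be checked against the general divided difference on the five points $x \pm h$, $x \pm k$, $x$. This is a sketch with names and a test polynomial introduced here for illustration; exact rational arithmetic is used so that the comparisons are equalities.

```python
from fractions import Fraction

def divided_difference(F, points):
    """Divided difference: sum of F(x_i) / prod_{j != i} (x_i - x_j)."""
    total = 0
    for i, xi in enumerate(points):
        denom = 1
        for j, xj in enumerate(points):
            if j != i:
                denom *= (xi - xj)
        total += F(xi) / denom
    return total

def omega4(F, x, h, k):
    """Right-hand side of (7.3)."""
    second_h = (F(x + h) - 2 * F(x) + F(x - h)) / (h * h)
    second_k = (F(x + k) - 2 * F(x) + F(x - k)) / (k * k)
    return 12 / (h * h - k * k) * (second_h - second_k)

def fourth_difference_quotient(F, x, h):
    """Right-hand side of (7.4)."""
    return (F(x + 2 * h) - 4 * F(x + h) + 6 * F(x)
            - 4 * F(x - h) + F(x - 2 * h)) / h ** 4

def G(x):
    # Hypothetical test polynomial of degree 6.
    return x**6 - x**5 + 2 * x

x, h, k = Fraction(1, 3), Fraction(1, 2), Fraction(1, 5)
five_points = [x + h, x + k, x, x - k, x - h]
print(omega4(G, x, h, k) == 24 * divided_difference(G, five_points))
print(omega4(G, x, 2 * h, h) == fourth_difference_quotient(G, x, h))
```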
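Using (7.3) and (7.4) we define

$$\Delta^{(4)}F = \limsup_{h,k \to 0} \omega_4(h, k), \qquad \delta^{(4)}F = \liminf_{h,k \to 0} \omega_4(h, k), \tag{7.5}$$

$$\overline{D}^4F = \limsup_{h \to 0} \omega_4(2h, h), \qquad \underline{D}^4F = \liminf_{h \to 0} \omega_4(2h, h), \tag{7.6}$$

and if $\overline{D}^4F = \underline{D}^4F$ we say that $F$ has a fourth symmetric Riemann derivative and write it $D^4F$. Clearly

$$\delta^{(4)}F \le \underline{D}^4F \le \overline{D}^4F \le \Delta^{(4)}F, \tag{7.7}$$

and we have the following lemmas.

LEMMA 7.1. If $F'''(x)$ exists in an interval containing $x$ and if $\Delta_1$ ($\delta_1$) is the greater (smaller) of the first derivates of $F'''$ then

$$\delta_1 \le \delta^{(4)} \le \Delta^{(4)} \le \Delta_1. \tag{7.8}$$

LEMMA 7.2. If $F''(x)$ exists everywhere in $[a, b]$ and $\underline{D}^4F \ge 0$ in $(a, b)$ then $F''(x)$ is continuous, convex, and

$$\Delta^4F(x; h) \ge 0 \tag{7.9}$$

for every $x$, $h > 0$ ($a \le x - 4h < x + 4h \le b$).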

The proof of Lemma 7.1 is very similar to that of Lemma 2.1. Making all the obvious changes define

$$X(u) = F(x + u) - F(x) - uF'(x) - \frac{u^2}{2}F''(x) - \frac{u^3}{3!}F'''(x) - \frac{u^4}{4!}(\delta_1 - \epsilon).$$

As in the previous proof it is sufficient to show that $\omega_4(X; 0; h, k) > 0$ for all $h$, $k$ small enough.

Defining

$$Y(u) = \frac{X(u) - 2X(0) + X(-u)}{u^2} = \frac{X(u) + X(-u)}{u^2}$$

it is sufficient to prove $Y(u)$ to be monotonic for $u$ small enough, $u > 0$. Then if we define

$$Z(u) = 2X(u) - uX'(u)$$

it is sufficient to show that $Z(u)$ has a local maximum at $u = 0$. This follows since $Z'(u)$ is monotonic decreasing, $Z''(u)$ being $-uX'''(u)$ which by a result similar to (2.9) is always negative.

The proof of Lemma 7.2 is exactly similar to that of Lemma 2.2 owing to the reasons given by Verblunsky in (7).


LEMMA 7.3. If $F''(x)$ exists and is continuous in $[a, b]$ then

$$\min_{a < x < b} \underline{D}^4F(x) \le 4!\,V_4(F: x_1, x_2, x_3, x_4, x_5) \le \max_{a < x < b} \overline{D}^4F(x) \tag{7.10}$$

for all $x_1, x_2, x_3, x_4, x_5$ in $[a, b]$.

LEMMA 7.4. If (7.10) holds for $F(x) - G(x)$, in particular if $(F - G)''$ is continuous, then $D^4(F - G) = 0$ implies

$$H_4(F: x_1, x_2, x_3, x_4, x_5) = H_4(G: x_1, x_2, x_3, x_4, x_5). \tag{7.11}$$

LEMMA 7.5. If $F(x)$ and $G(x)$ are defined in $[a, b]$ and (i) $(F - G)$ is continuous in $[a, b]$, (ii) $(F - G)''$ exists in $(a, b)$ then $D^4(F - G) = 0$ implies (7.11).

As the proofs of the corresponding Lemmas 3.1, 3.2, and 3.3 depend on (6), the proofs of these are exactly the same but are based on (7).

Now let $f(x)$ be defined at each point and measurable. If there exists a function $F(x)$, continuous on $[a, b]$ and with a second derivative on $(a, b)$ such that $D^4F = f$, we define the $J_4$-integral of $f$ to be

$$\int_{x_1, x_2, x_3, x_4}^{x} f(t)\,d_4t = H_4(F: x_1, x_2, x_3, x_4, x) \tag{7.12}$$

where $x_1, x_2, x_3, x_4, x$ are any five points of $[a, b]$ and $a \le x_1 < x_2 < x_3 < x_4 \le b$. The discussion of § 4 applies with obvious changes to this definition. Further, the following theorems can be proved.

THEOREM 4. If

$$\sum_{n=-\infty}^{\infty} c_n e^{int}, \qquad c_0 = 0,$$

is $(C, 2)$ summable everywhere to $f(t)$ and $c_n = o(n)$ and if

$$F(t) = \sum_{n \ne 0} \frac{c_n e^{int}}{(in)^4} \tag{7.13}$$

then

$$\int_{x_1, x_2, x_3, x_4}^{x} f(t)\,d_4t = H_4(F; x_1, x_2, x_3, x_4, x) \tag{7.14}$$

where $x_1 < x_2 < x_3 < x_4$, $x$ are any five numbers.











THEOREM 5. Jf 
~ tnz 
a Cne ’ 
has Cy, = o(n) and is (C, 2) summable to f(t) then 
CG = ra fie “dg 
‘ 8x —4n,—2r, le, 4 sais 


THEOREM 6. Let $f(x)$ be the finite fourth Riemann symmetric derivative of a function continuous on $[a, b]$ and with a second derivative on $(a, b)$. Let $f(x)$ be Lebesgue integrable except at a finite number of points $\beta_1, \ldots, \beta_m$. Further suppose that

$$a_i(x) = \int^{x}\!\!\int^{v}\!\!\int^{u} f(t)\,dt\,du\,dv, \qquad i = 1, 2, \ldots, m,$$

is Denjoy integrable in some interval $(\alpha_i, \gamma_i)$ containing $\beta_i$. Then if we define

$$\Phi(x) = \int^{x}\!\!\int^{y}\!\!\int^{v}\!\!\int^{u} f(t)\,dt\,du\,dv\,dy$$

then

$$\int_{x_1, x_2, x_3, x_4}^{x} f(t)\,d_4t = H_4(\Phi: x_1, x_2, x_3, x_4, x).$$


REFERENCES 


1. R. D. James, A generalized integral II, Can. J. Math., 2 (1950), 297-306.
2. R. D. James, Summable trigonometric series, Pac. J. Math., 6 (1956), 99-110.
3. R. L. Jeffery, Trigonometric series (Toronto, 1956).
4. C. Kassimatis, Functions which have generalized Riemann derivatives, Can. J. Math., 10 (1958), 413-420.
5. S. Saks, On the generalized derivative, J. London Math. Soc., 7 (1932), 247-251.
6. S. Verblunsky, The generalized third derivative and its application to the theory of trigonometric series, Proc. London Math. Soc. (2), 31 (1930), 387-406.
7. S. Verblunsky, The generalized fourth derivative, J. London Math. Soc., 6 (1931), 82-84.
8. A. Zygmund, Trigonometric series (2nd ed.; Cambridge, 1959), I, II.


University of British Columbia











A NOTE ON NON-NEGATIVE MATRICES 
C. R. PUTNAM 


1. Introduction. This note can be regarded as an addendum to the paper (4). On the complex Hilbert space of vectors $x = (x_1, x_2, \ldots)$ a matrix $A$ is said to be bounded if there exists a constant $M$ such that $\|Ax\| \le M\|x\|$ whenever $\|x\|^2 = \sum |x_i|^2 < \infty$; the least such $M$ is denoted by $\|A\|$. Only bounded matrices $A$ and vectors $x$ satisfying $\|x\| < \infty$ will be considered in the sequel. The spectrum of $A$, denoted by sp($A$), is the set of values $\lambda$ for which the resolvent $R(\lambda) = (A - \lambda I)^{-1}$ fails to be bounded. The notation $A \ge 0$ or $A > 0$, where $A = (a_{ij})$, means that, for all $i$ and $j$, $a_{ij} \ge 0$ or $a_{ij} > 0$ respectively. There was stated in (4) the following theorem (also contained in some results of Bonsall, cf. the references cited in (4)) generalizing results of Perron and Frobenius for finite matrices:

(I) If $A \ge 0$, then $\mu = \sup |\lambda|$, where $\lambda$ is in sp($A$), also belongs to sp($A$).

The proof in (4) of this theorem is not correct for arbitrary bounded $A \ge 0$, although it is valid for any such matrix with a spectrum identical with the set of (function theoretical) singularities of its resolvent, that is, with the set of singularities of at least one element of the resolvent. However, although the spectrum always contains this latter set, there exist bounded matrices, even satisfying $A \ge 0$, for which a number can belong to the spectrum and, at the same time, be an analytic point of each element of the resolvent. Such a matrix is given by $B = (b_{ij})$ where $b_{ij} = 1$ or $0$ according as $j$ is, or is not, $i + 1$. In fact, $B \ge 0$, sp($B$) is the unit disk $|\lambda| \le 1$, and 0 is the only singularity of $R(\lambda)$ (see (6, p. 145)). In view of this circumstance, an alternate simple proof of (I) will be given in § 2 below.

A few remarks relating to (4) will be made in § 3. In § 4, generalizations 
of certain theorems stated in a recent paper of Birkhoff and Varga (1, pp. 
356-357), will be given. 


2. In order to prove (I), it is sufficient to show that $R(\lambda)$ is bounded whenever $R(|\lambda|)$ is bounded. Now $R(\lambda)$ is given by $R(\lambda) = -\sum_{n=0}^{\infty} A^n/\lambda^{n+1}$ whenever $|\lambda|$ is sufficiently large, in fact, whenever $|\lambda|$ exceeds the spectral radius,

$$\mu \ \bigl(= \lim_{n \to \infty} \|A^n\|^{1/n}\bigr),$$


Received July 7, 1959. This research was supported by the United States Air Force through 
the Air Force Office of Scientific Research of the Air Research and Development Command, 
under Contract No. AF 18 (603) - 139. Reproduction in whole or in part is permitted for any 
purpose of the United States Government. 




of $A$ (cf., for example, (5, p. 421)). Since $A \ge 0$, also $A^n \ge 0$, and so if $R(\lambda) = (R_{ij}(\lambda))$, it is clear that, for $|\lambda| > \mu$, $|R_{ij}(\lambda)| \le -R_{ij}(|\lambda|)$. If $x = (x_1, x_2, \ldots)$ and $X = (|x_1|, |x_2|, \ldots)$, then $\|x\| = \|X\|$ ($< \infty$). Consequently,

$$\|R(\lambda)x\|^2 = \sum_i \Bigl|\sum_j R_{ij}(\lambda)x_j\Bigr|^2 \le \sum_i \Bigl(\sum_j |R_{ij}(|\lambda|)|\,|x_j|\Bigr)^2 = \|R(|\lambda|)X\|^2 \le \|R(|\lambda|)\|^2\,\|x\|^2,$$

and so $\|R(\lambda)\| \le \|R(|\lambda|)\|$ whenever $|\lambda| > \mu$. But if $\mu$ is not in sp($A$), then $\|R(\mu)\| < \infty$, and it follows from the continuity of $\|R(\lambda)\|$ on the complement of the spectrum, and from the fact that $\|R(\lambda)\| \to \infty$ whenever $\lambda$ is not in sp($A$) and tends to a point of sp($A$), that $\|R(\lambda)\| \le \|R(\mu)\| < \infty$ for $|\lambda| = \mu$. Since sp($A$) is closed, this last inequality implies that the spectral radius is less than $\mu$, a contradiction, and the proof of (I) is now complete.
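
The key entrywise estimate $|R_{ij}(\lambda)| \le -R_{ij}(|\lambda|)$ can be illustrated for a finite non-negative matrix by summing the Neumann series directly. This is a sketch only: the $3 \times 3$ matrix `A`, the truncation length and the tolerance are assumptions made for the illustration, and the finite case says nothing about the infinite matrices that motivate the note.

```python
import cmath

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def resolvent_series(A, lam, terms=200):
    """Partial sum of R(lam) = -sum_{n >= 0} A^n / lam^(n+1)."""
    n = len(A)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # A^0
    lam_pow = lam  # holds lam^(n+1) for the current term
    R = [[0j] * n for _ in range(n)]
    for _ in range(terms):
        for i in range(n):
            for j in range(n):
                R[i][j] -= power[i][j] / lam_pow
        power = mat_mul(power, A)
        lam_pow *= lam
    return R

def entrywise_bound_holds(A, lam, tol=1e-12):
    """Check |R_ij(lam)| <= -R_ij(|lam|) for every entry."""
    R = resolvent_series(A, lam)
    R_abs = resolvent_series(A, abs(lam))
    n = len(A)
    return all(abs(R[i][j]) <= -R_abs[i][j].real + tol
               for i in range(n) for j in range(n))

# Hypothetical non-negative matrix; its row sums are at most 1.5, so the
# series converges for |lam| = 2.
A = [[0.0, 1.0, 0.0], [0.5, 0.0, 1.0], [0.2, 0.3, 0.0]]
print(entrywise_bound_holds(A, cmath.rect(2.0, 2.1)))
```

The bound is just the triangle inequality applied term by term to the series, which is why it also holds for every partial sum.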








3. The third theorem in (4) can be stated as

(III) If $A \ge 0$ and if $\mu$ of (I) is a pole of the resolvent $R(\lambda) = (A - \lambda I)^{-1}$ then there exists a characteristic vector $x \ge 0$ of $A$ belonging to $\mu$, thus $Ax = \mu x$ ($x \ne 0$).

Actually it was assumed in (4) that $\mu$ should be positive; the proof given there, § 5, makes it clear, however, that this need not be assumed. If $\lambda$ is real and satisfies $\lambda > \mu$, then $R(\lambda) = -\sum A^n/\lambda^{n+1} \le 0$, and the matrix inequality $c_{-N} \le 0$ ($N \ge 1$, $c_{-N} \ne 0$) needed in the representation

$$R(\lambda) = \sum_{n=-N}^{\infty} c_n(\lambda - \mu)^n$$

of (4, § 5), is still assured.

Whether the assumption in (III) that $\mu$ be a pole of $R(\lambda)$ can be weakened to the (implied) condition that $\mu$ be an isolated point of sp($A$) and belong to the point spectrum will remain undecided. It is even conceivable that only the assumption that $\mu$ be in the point spectrum is needed in the hypothesis of (III).

Incidentally, the statement of (3), and mentioned in (4), that if $A \ge 0$ and is completely continuous, and if the diagonal elements of every power $A^n$ are zero, then zero is the only point of sp($A$), surely cannot be true if the assumption of complete continuity is omitted, as the matrix $B$ cited earlier in this paper shows.


4. Generalizations of certain theorems for finite matrices stated in a recent paper of Birkhoff and Varga (1, pp. 356-357), will be given in this section. Corresponding to the terminology of (1), a matrix $A$ will be called non-negative or positive according as $A \ge 0$ or $A > 0$, essentially non-negative if $a_{ij} \ge 0$ for $i \ne j$, and irreducible (also indecomposable, cf. the references cited in (1)) if, for any $i$ and $j$, there exists a finite sequence $i = k(0), k(1), \ldots, k(N) = j$ for which $a_{k(n-1)k(n)} \ne 0$ for $n = 1, 2, \ldots, N$. A vector $x = (x_1, x_2, \ldots)$ will be called non-negative or positive according as $x_i \ge 0$ or $x_i > 0$ for all $i$.







(i) If $A$ is essentially non-negative then $\nu = \max \mathrm{Re}(\mathrm{sp}(A))$ is in sp($A$); moreover, $\nu > \mathrm{Re}(\lambda)$ if $\lambda \ne \nu$ and $\lambda$ is in sp($A$). In case $\nu$ is a pole of the resolvent $R(\lambda) = (A - \lambda I)^{-1}$, then $A$ has a non-negative characteristic vector $x$ belonging to $\nu$.

In fact, (i) follows readily from (I) and (III) if these latter theorems are applied to the matrix $C = A + \alpha I$ which is non-negative if $\alpha$ is positive and sufficiently large. It is to be noted that the resolvent of $C$ is given by $R(\lambda - \alpha)$.

Furthermore,

(ii) If $A$ is essentially non-negative and irreducible and if $\nu$ of (i) is a pole of $R(\lambda)$, then (a) $\nu$ is a simple pole of $R(\lambda)$, (b) $\nu$ is a simple characteristic number, and (c) there exists a positive characteristic vector $x$ of $A$ belonging to $\nu$.

Assertion (ii) follows from (IV) of (4), namely,

(IV) If $C \ge 0$, if for every pair $i$, $j$ there exists an integer $M = M(i, j)$ such that $(C^M)_{ij} > 0$, and if $\mu$ of (I) is a pole of $R(\lambda) = (C - \lambda I)^{-1}$, then (a) $\mu$ is a simple pole of $R(\lambda)$, (b) $\mu$ is a simple characteristic number, and (c) there exists a characteristic vector $x > 0$ belonging to $\mu$.

In order to see this, let (IV) be applied to $C = A + \alpha I$, which is non-negative for $\alpha$ positive and sufficiently large, and note that the condition $(C^M)_{ij} > 0$ for some positive integer $M = M(i, j)$ is a consequence of the present assumption of irreducibility of $A$, provided that $\alpha$ is sufficiently large. (In this connection for finite matrices, see (2, p. 20)). For, let $\alpha > 0$ be chosen so large that the diagonal elements $c_{ii}$ of $C$ are positive. Since $c_{ij} = a_{ij}$ if $i \ne j$, it is then clear that the irreducibility of $A$ implies that of $C$. Consequently, since $C \ge 0$, there exists for any pair $i$, $j$ a positive integer $M = M(i, j)$ and a finite sequence $i = k(0), k(1), \ldots, k(M) = j$ such that

$$d = \prod_{n=1}^{M} c_{k(n-1)k(n)} > 0.$$

But $(C^M)_{ij}$ is given by a sum of non-negative terms one of which is $d$ and so $(C^M)_{ij} > 0$. Thus, as remarked above, (ii) follows from (IV) of (4). Incidentally, the above argument makes clear that a non-negative matrix, here $C$, satisfies $(C^M)_{ij} > 0$ for every pair $i$, $j$ and some positive integer $M = M(i, j)$ if and, in fact, only if, it is irreducible.
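
For finite matrices the equivalence just stated is easy to test. The sketch below is an illustration only: it checks irreducibility by path search and compares it with the positive-power condition for $C = A + \alpha I$. The two sample matrices and the shifts $\alpha$ are hypothetical, and powers up to the order of the matrix suffice because a shortest chain $k(0), \ldots, k(M)$ never needs more than that many steps.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def shift(A, alpha):
    """Return C = A + alpha * I."""
    n = len(A)
    return [[A[i][j] + (alpha if i == j else 0) for j in range(n)] for i in range(n)]

def is_irreducible(A):
    """For every i, j there is a chain i = k(0), ..., k(N) = j, N >= 1,
    with every a_{k(n-1) k(n)} nonzero."""
    n = len(A)
    for start in range(n):
        reached = set()
        frontier = [start]
        while frontier:
            i = frontier.pop()
            for j in range(n):
                if A[i][j] != 0 and j not in reached:
                    reached.add(j)
                    frontier.append(j)
        if len(reached) < n:
            return False
    return True

def every_pair_has_positive_power(C):
    """For every i, j some (C^M)_{ij} > 0 with 1 <= M <= n (C non-negative)."""
    n = len(C)
    seen = [[False] * n for _ in range(n)]
    power = C
    for _ in range(n):
        for i in range(n):
            for j in range(n):
                if power[i][j] > 0:
                    seen[i][j] = True
        power = mat_mul(power, C)
    return all(all(row) for row in seen)

# Hypothetical essentially non-negative matrices (off-diagonal entries >= 0).
A_irreducible = [[-2, 1, 0], [0, -1, 3], [4, 0, -5]]
A_reducible = [[-1, 1], [0, -2]]

print(is_irreducible(A_irreducible), every_pair_has_positive_power(shift(A_irreducible, 6)))
print(is_irreducible(A_reducible), every_pair_has_positive_power(shift(A_reducible, 3)))
```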










REFERENCES 


1. G. Birkhoff and R. S. Varga, Reactor criticality and nonnegative matrices, J. Soc. Indust. Appl. Math., 6 (1958), 354-377.
2. I. N. Herstein, A note on primitive matrices, Amer. Math. Monthly, 61 (1954), 18-20.
3. M. G. Krein and M. A. Rutman, Linear operators leaving invariant a cone in a Banach space, Uspehi Matem. Nauk (N.S.), 3 (23) (1948), 3-95; Amer. Math. Soc. Translation No. 26 (page reference in paper refers to this translation).
4. C. R. Putnam, On bounded matrices with non-negative elements, Can. J. Math., 10 (1958), 587-591.
5. F. Riesz and B. Sz-Nagy, Leçons d'analyse fonctionnelle (Budapest, 1953).
6. A. Wintner, Spektraltheorie der unendlichen Matrizen (Leipzig, 1929).


Purdue University 
















SUMMABILITY METHODS ON MATRIX SPACES 
JOSEPHINE MITCHELL 


§1. Introduction. The matrix spaces under consideration are the four main types of irreducible bounded symmetric domains given by Cartan (5). Let $z = (z_{jk})$ be a matrix of complex numbers, $z'$ its transpose, $z^*$ its conjugate transpose and $I = I^{(n)}$ the identity matrix of order $n$. Then the first three types are defined by

$$D = [z \mid I - zz^* > 0], \tag{1}$$

where $z$ is an $n$ by $m$ matrix ($n \le m$), a symmetric or a skew-symmetric matrix of order $n$ (16). The fourth type is the set of complex spheres satisfying

$$|z'z| < 1, \qquad 1 - 2z^*z + |z'z|^2 > 0,$$

where $z$ is an $n$ by 1 matrix. It is known that each of these domains possesses a distinguished boundary $B$ which in the first three cases is given by

$$B = [u \mid uu^* = I]. \tag{2}$$

(In the case of skew symmetric matrices the distinguished boundary is given by (2) only if $n$ is even.)

In § 2 we consider the following problem for the first type of domain with $m = n$, in which case $u$ is a unitary matrix, the (real) dimension of $B$ is $n^2$ and of $D$ is $2n^2$. Let $f(u)$ be a real integrable function defined on $B$ and consider the integral operator

$$I(f, z) = \int_B P(z, u)\,f(u)\,dV, \tag{3}$$

where $P(z, u)$ is the Poisson kernel (14)

$$P(z, u) = V^{-1} \det{}^{n}\bigl[(I - zu^*)^{-1}(I - zz^*)(I - uz^*)^{-1}\bigr], \tag{4}$$

$V$ is the Euclidean volume of $B$, and $dV$ the Euclidean volume element. It is known that $I(f, z)$ is a harmonic function of $z$ if $I - zz^* > 0$ or $I - zz^* < 0$. (I proved this fact in (14) for $z \in D$ but the proof is valid for all $z$ and $u \in B$ for which $\det(I - zu^*) \ne 0$. It is easily proved that $\det(I - zu^*) \ne 0$ for $u \in B$ and all $z$ such that $I - zz^* > 0$ or $I - zz^* < 0$.) Here a harmonic function is a function of class $C^2$, which satisfies on $D$ the Laplace equation corresponding to the invariant metric of $D$, that is, the metric invariant with respect to the group of 1 to 1 analytic transformations mapping $D$ onto itself (14). This invariant metric is given by

$$ds^2 = \sigma[(I - zz^*)^{-1}\,dz\,(I - z^*z)^{-1}\,dz^*],$$


Received November 30, 1959. Presented to meetings of American Mathematical Society on 
January 29, 1960 and January 24, 1961. 




where $\sigma(A)$ is the trace of the matrix $A$ and $dz = (dz_{jk})$, and the corresponding
Laplace equation is

$4\sigma[\bar\partial (I - z^*z)\, \partial' (I - zz^*)] = 0, \quad \partial = (\partial/\partial z_{jk}).$

It has been proved by Hua and Lowdenslager that given a real function $f$,
continuous on $B$, there exists a function $F$, harmonic on $D$, such that $F(z) \to f(u_0)$
as $z \to u_0 \in B$ radially, that is, along the set $\rho u_0$, $0 < \rho < 1$ (6; 9).
Further, if $F$ is continuous on the closure $\bar D$ of $D$ and satisfies certain other
conditions due to Lowdenslager on the boundary of $D$ other than $B$, then $F$
is unique (8). Now for the particular case of the unit circle, $z\bar z = 1$, if we merely
assume that $f$ is integrable on it, then $I(f, \rho u_0) \to f(u_0)$ as $\rho \to 1$ ($0 < \rho < 1$)
(20); this method of approach is known as Abel-Poisson summability of Fourier
series. We prove this result for matrix spaces. (See note added in proof.)

In §3 we consider for the first type of domain ($n \le m$) some properties of
complete orthonormal systems (CONS) of complex homogeneous polynomials
defined on $D$. The space $D$ is circular with center at $z = 0$, that is, if $z \in D$,
then $e^{i\theta}z \in D$ for $0 \le \theta \le 2\pi$. Hence any two powers

(5) $P(z) = \prod_{j,k} z_{jk}^{s_{jk}}, \quad Q(z) = \prod_{j,k} z_{jk}^{t_{jk}},$

$s_{jk}$, $t_{jk}$ non-negative integers, for which

(6) $\sum_{j,k} s_{jk} \ne \sum_{j,k} t_{jk},$

are orthogonal, that is,

(7) $(P, Q) = \int_D P(z)\overline{Q(z)}\, dW = 0$

($dW$ is Euclidean volume element on $D$) (6). Also if $f \in$ class $L^2$ on $D$, which
means that $f$ is single-valued and analytic on $D$ and has finite norm $\|f\| =
[(f, f)]^{1/2}$ (2), then the set $\{P\}$ is complete with respect to functions of class
$L^2$ (6).

Here we refine conditions (6) to show that $(P, Q) \ne 0$ implies

(8) $\sum_k s_{jk} = \sum_k t_{jk}, \quad \sum_j s_{jk} = \sum_j t_{jk} \quad (j = 1, \dots, n;\ k = 1, \dots, m).$

By means of (8) the set of powers $\{P\}$ is subdivided into disjoint subsets whose
members need not be orthogonal to each other. The elements of a subset are
made into an orthonormal set by the Gram-Schmidt formulas, thus giving a
CONS of homogeneous polynomials $\{\phi_\nu\}$ on $D$. We note that Hua has constructed
a CONS of functions of class $L^2$ on $D$ using representation theory
(6).
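The subdivision by (8) can be illustrated with a small sketch. The helper names `margins` and `same_class` are hypothetical and chosen only for this illustration; the sketch merely compares the row sums and column sums of two exponent matrices $(s_{jk})$ and $(t_{jk})$, which is the necessary condition (8) for $(P, Q) \ne 0$. It does not compute any inner product.

```python
def margins(exponents):
    """Row sums and column sums of an n-by-m matrix of non-negative exponents."""
    rows = tuple(sum(row) for row in exponents)
    cols = tuple(sum(col) for col in zip(*exponents))
    return rows, cols


def same_class(s, t):
    """True when the exponent matrices s and t satisfy both equalities in (8)."""
    return margins(s) == margins(t)


# Exponent matrices (s_jk) of three monomials in a 2-by-2 matrix variable z.
z11_z22 = [[1, 0], [0, 1]]   # z_11 * z_22
z12_z21 = [[0, 1], [1, 0]]   # z_12 * z_21
z11_sq = [[2, 0], [0, 0]]    # z_11 ** 2

print(same_class(z11_z22, z12_z21))  # both have row sums (1, 1) and column sums (1, 1)
print(same_class(z11_z22, z11_sq))   # row sums (1, 1) versus (2, 0)
```

Monomials for which `same_class` is false are orthogonal by the refinement (8); monomials for which it is true fall in the same subset and may or may not be orthogonal.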

In § 4 applications of the CONS $\{\phi_\nu\}$ are given. First an Abelian theorem
is obtained and then a Tauberian theorem for the orthogonal series $\sum a_\nu \phi_\nu$
as $z$ approaches $[I, 0]$ of $B$ along the matrix $[r, 0]$ where $r$ is the diagonal matrix
$[r_1, \dots, r_n]$, $0 < r_j < 1$. Next a Cauchy's inequality is obtained for the
Fourier coefficients $a_\nu$. Finally two mean value theorems, which generalize
analogous theorems for the unit circle, are proved.


§2. Poisson summability.


1. Reduction of integral (1.3) to normal form. Rauch outlined this reduction
to me. The transformation

(1) $w = zu_0^*, \quad u_0u_0^* = I,$

takes $z = u_0$ into $w = I$ and also leaves $D$ and $B$ invariant since under it
$I - ww^* = I - zz^*$. Also if $u \to v$ under (1), $P(z, u) \to P(w, v)$ and

$dV_u = \dfrac{\dot u}{\det^n u} \to \dfrac{\partial(u)}{\partial(v)}\, \dfrac{\dot v}{\det^n v\, \det^n u_0},$

where $\dot u$ is a constant multiple of $\prod_{j,k} du_{jk}$
(14). Now $du = dv\, u_0$, the Jacobian of which is $\partial(u)/\partial(v) = \det^n u_0$ (3). Thus
$dV_u \to dV_v$. Also $f(u) \to f(vu_0) = f_0(v)$ so that $I(f, z) \to I(f_0, w)$.
If $w \to I$ along the set of points $\rho I$ ($0 < \rho < 1$), then

$\det(I - ww^*) = (1 - \rho^2)^n$

and

$Q = (I - wv^*)(I - vw^*) = I - wv^* - vw^* + ww^* = I(1 + \rho^2) - \rho(v + v^*).$

Now $v$ is unitarily equivalent to a diagonal matrix $v_D$ which is also unitary (10,
Theorem 41.41), that is,

(2) $v = U^* v_D U.$

Thus if $v_D = [d_1, \dots, d_n]$, then $d_j \bar d_j = 1$ and we can write $d_j = e^{i\theta_j}$ ($0 \le \theta_j \le 2\pi$). Hence

$Q = U^*[I(1 + \rho^2) - \rho(v_D + \bar v_D)]U$

and

$\det Q = \det[I(1 + \rho^2) - \rho(v_D + \bar v_D)] = \prod_{j=1}^{n} (1 - 2\rho\cos\theta_j + \rho^2).$
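The determinant identity for $Q$ can be checked numerically. The following is a minimal sketch for $n = 2$ only, using plain nested lists; the helper names (`matmul`, `dagger`, `det2`, `det_Q`, `product_formula`) and the sample angles are assumptions made for the illustration and do not come from the paper.

```python
import cmath
import math


def matmul(a, b):
    """Product of two 2-by-2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Conjugate transpose of a 2-by-2 complex matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def det2(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def det_Q(rho, v):
    """det[(I - rho v*)(I - rho v)], that is, det Q for w = rho I."""
    vs = dagger(v)
    left = [[(1 if i == j else 0) - rho * vs[i][j] for j in range(2)]
            for i in range(2)]
    right = [[(1 if i == j else 0) - rho * v[i][j] for j in range(2)]
             for i in range(2)]
    return det2(matmul(left, right))


def product_formula(rho, thetas):
    """Right-hand side: product of (1 - 2 rho cos(theta_j) + rho^2)."""
    return math.prod(1 - 2 * rho * math.cos(t) + rho * rho for t in thetas)


thetas = (0.7, 2.9)          # sample eigen-angles of v (arbitrary choice)
angle = 0.4                  # sample rotation angle defining the unitary U
U = [[complex(math.cos(angle)), complex(math.sin(angle))],
     [complex(-math.sin(angle)), complex(math.cos(angle))]]
v_D = [[cmath.exp(1j * thetas[0]), 0j], [0j, cmath.exp(1j * thetas[1])]]
v = matmul(matmul(dagger(U), v_D), U)   # v = U* v_D U as in (2)

rho = 0.8
print(abs(det_Q(rho, v) - product_formula(rho, thetas)))  # difference at rounding level
```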
Let

(3) $p(\rho, \theta_j) = \dfrac{1 - \rho^2}{1 - 2\rho\cos\theta_j + \rho^2},$

which is $2\pi$ times the Poisson kernel for the unit circle. Then

(4) $P(\rho I, v) = V^{-1}\prod_{j=1}^{n} p^n(\rho, \theta_j)$

and

(5) $I(f_0, \rho I) = V^{-1}\int_B \prod_{j=1}^{n} p^n(\rho, \theta_j) f_0(v)\, dV_v.$
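Two standard properties of the kernel (3) drive the convergence argument: its integral over $[0, \pi]$ equals $\pi$ for every $\rho$ in $[0, 1)$, and its mass away from $\theta = 0$ becomes small as $\rho \to 1$. The sketch below checks both with a midpoint rule; the function names, the step count, and the sample values of $\rho$ and of the cut-off 0.5 are illustrative assumptions.

```python
import math


def p(rho, theta):
    """Kernel (3): (1 - rho^2) / (1 - 2 rho cos(theta) + rho^2)."""
    return (1 - rho * rho) / (1 - 2 * rho * math.cos(theta) + rho * rho)


def midpoint_integral(g, a, b, steps=20000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))


# Integral over [0, pi] for a fixed rho.
total = midpoint_integral(lambda t: p(0.9, t), 0.0, math.pi)

# Integral over [0.5, pi] for rho close to 1: the kernel concentrates at theta = 0.
tail = midpoint_integral(lambda t: p(0.999, t), 0.5, math.pi)

print(total - math.pi, tail)
```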


According to Weyl (19, p. 197) if (2) holds, then

(6) $dV_v = [dV_U]\,\Delta\bar\Delta\, d\theta_1 \dots d\theta_n,$

(7) $\Delta\bar\Delta = \prod_{j<k} |e^{i\theta_j} - e^{i\theta_k}|^2 = \prod_{j<k} 4\sin^2\tfrac12(\theta_j - \theta_k)$

and $dV_0 = [dV_U]$ is a constant times the Euclidean volume element on the
other $n^2 - n$ parameters defining $B$. Let $B_0$ be this part of $B$. Now $P(\rho I, v)$,
considered as a function of $v$, is independent of the other $n^2 - n$ parameters
and hence

(8) $I(f_0, \rho I) = \int_0^{2\pi} \cdots \int_0^{2\pi} P(\rho I, v) F(\theta)\,\Delta\bar\Delta\, d\theta_1 \dots d\theta_n,$

where

(9) $F(\theta) = F(\theta_1, \dots, \theta_n) = \int_{B_0} f_0(v)\, dV_0.$
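The two forms of the density in (7) rest on the elementary identity $|e^{ia} - e^{ib}|^2 = 4\sin^2\tfrac12(a - b)$. A short numerical sketch follows; the function names and the sample angles are chosen for illustration only.

```python
import cmath
import math
from itertools import combinations


def density_exponential(thetas):
    """Product over j < k of |exp(i theta_j) - exp(i theta_k)|^2."""
    return math.prod(abs(cmath.exp(1j * a) - cmath.exp(1j * b)) ** 2
                     for a, b in combinations(thetas, 2))


def density_sine(thetas):
    """Product over j < k of 4 sin^2((theta_j - theta_k) / 2)."""
    return math.prod(4 * math.sin((a - b) / 2) ** 2
                     for a, b in combinations(thetas, 2))


sample = (0.3, 1.1, 4.0)
print(density_exponential(sample), density_sine(sample))
```

For $n = 2$ the product has the single factor $4\sin^2\tfrac12(\theta_1 - \theta_2)$.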
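2. Convergence theorem for (8). It is sufficient to consider (8) for $n = 2$,
in which case replacing $\theta_1$ by $s$ and $\theta_2$ by $t$

(10) $I(f_0, \rho I) = 4V^{-1}\int_0^{2\pi}\!\!\int_0^{2\pi} p^2(\rho, s)\,p^2(\rho, t)\sin^2\tfrac12(s - t)\,F(s, t)\, ds\, dt$

and we consider $\lim_{\rho\to1} I(f_0, \rho I)$. Subtracting $S$ from each side we reduce (10)
by well-known methods in Fourier series (20) to a consideration of

(11) $I = 4V^{-1}\int_0^{\pi}\!\!\int_0^{\pi} p^2(\rho, s)\,p^2(\rho, t)\sin^2\tfrac12(s - t)\,G(s, t)\, ds\, dt$

and

(11a) $I_1 = 4V^{-1}\int_0^{\pi}\!\!\int_0^{\pi} p^2(\rho, s)\,p^2(\rho, t)\sin^2\tfrac12(s + t)\,G(2\pi - s, t)\, ds\, dt,$

where

(12) $G(s, t) = F(s, t) + F(2\pi - s, 2\pi - t) - 2V_0S,$

$V_0$ being the volume of $B_0$. We show that under certain hypotheses on $G(s, t)$,
$I \to 0$ as $\rho \to 1$ and the proof is similar for $I_1$. In (11) integrate by parts with
respect to $s$ and $t$. Since

$H(s, t) = \int_0^s\!\!\int_0^t G(s, t)\, ds\, dt$

is zero for $s = 0$ or $t = 0$, $\sin^2\tfrac12(s - t) = 0$ for $s = t = \pi$, $\sin^2\tfrac12(\pi - \theta) =
\cos^2\tfrac12\theta$ and $p(\rho, \pi) = (1 - \rho)/(1 + \rho)$, we get

(13) $I = -4V^{-1}\left(\dfrac{1 - \rho}{1 + \rho}\right)^2 \int_0^\pi \dfrac{\partial}{\partial t}[p^2(\rho, t)\cos^2\tfrac12 t]\,H(\pi, t)\, dt$
$\quad - 4V^{-1}\left(\dfrac{1 - \rho}{1 + \rho}\right)^2 \int_0^\pi \dfrac{\partial}{\partial s}[p^2(\rho, s)\cos^2\tfrac12 s]\,H(s, \pi)\, ds$
$\quad + 4V^{-1}\int_0^\pi\!\!\int_0^\pi \dfrac{\partial^2}{\partial s\,\partial t}[p^2(\rho, s)\,p^2(\rho, t)\sin^2\tfrac12(s - t)]\,H(s, t)\, ds\, dt$
$\quad \equiv I_1 + I_2 + I_3.$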










The following inequalities are used in considering $I_1$, $I_2$, and $I_3$ (20):


: 1 
| (4) @) — 0< po, < +4 
| (ii) 0< on<izt 
an . Pip, 4 sin’}6 
= . 6 
(iii) wry, = ] 
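(iv) $\int_0^\pi p(\rho, \theta)\, d\theta = \pi$

(v) $\int_\delta^\pi p(\rho, \theta)\, d\theta = o(1)$ as $\rho \to 1$ for $0 < \delta \le \pi$.

Also

$\dfrac{\partial}{\partial\theta}\, p(\rho, \theta) = -2\rho(1 - \rho^2)\sin\theta\; d^{-2}(\rho, \theta),$

where

$d(\rho, \theta) = 1 - 2\rho\cos\theta + \rho^2 = \dfrac{1 - \rho^2}{p(\rho, \theta)},$

and

$\dfrac{\partial^2}{\partial s\,\partial t}[p^2(\rho, s)\,p^2(\rho, t)\sin^2\tfrac12(s - t)] = p(\rho, s)\,p(\rho, t)\,\cdot$
$[16\rho^2(1 - \rho^2)^2 \sin s \sin t \sin^2\tfrac12(s - t)\, d^{-2}(\rho, s)\, d^{-2}(\rho, t) - \tfrac12\, p(\rho, s)\,p(\rho, t)\cos(s - t)$
$\ + 2\rho(1 - \rho^2)\sin s \sin(s - t)\, p(\rho, t)\, d^{-2}(\rho, s)$
$\ - 2\rho(1 - \rho^2)\sin t \sin(s - t)\, p(\rho, s)\, d^{-2}(\rho, t)].$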


Concerning the function $G(s, t)$ we assume that

(15) $\lim_{h, k \to 0} \dfrac{1}{hk}\int_0^h\!\!\int_0^k |G(s, t)|\, ds\, dt = 0,$

$\int_0^h\!\!\int_0^k |G(s, t)|\,|ds\, dt| \le C|hk|, \quad (0 < |h|, |k| \le \pi),$

$C$ an absolute constant. (See (11) where these conditions were used in a similar
connection.) Then using (15) we find

$|I_1| \le \pi(1 - \rho)^2 C \int_0^\pi \left|2p(\rho, t)\cos^2\tfrac12 t\;\partial p(\rho, t)/\partial t - p^2(\rho, t)\sin\tfrac12 t\cos\tfrac12 t\right| t\, dt$

$\quad = O\!\left((1 - \rho)\int_0^\pi p(\rho, t)\, dt\right) = O(1 - \rho)$

and similarly for $I_2$. Now



By (15) it is found that 


lIu| < ef | b(p, s)p(e, t)dsdt, 


where $\varepsilon$ is an arbitrary positive number, and


af 1) 
Iu = (se F 


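Consequently given $\varepsilon > 0$ choose $\delta > 0$ so that

$\left|\int_0^s\!\!\int_0^t G(\sigma, \tau)\, d\sigma\, d\tau\right| < \varepsilon st,$

if $|s| < \delta$ and $|t| < \delta$. With fixed $\delta$ choose $1 - \rho$ sufficiently small. Then
$I = O(\varepsilon)$ for $\rho$ sufficiently close to 1. Consequently we have proved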


THEOREM 2.1. Let $F(s, t)$ be an integrable function. If $G(s, t) = F(s, t) +
F(2\pi - s, 2\pi - t) - 2V_0S$ satisfies conditions (15), then

$\lim_{\rho \to 1} 4V^{-1}\int_0^{2\pi}\!\!\int_0^{2\pi} p^2(\rho, s)\,p^2(\rho, t)\sin^2\tfrac12(s - t)\,F(s, t)\, ds\, dt = S.$


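For $n > 2$ it would be sufficient to assume that

(16) $\lim_{\theta_j \to 0} \dfrac{1}{\theta_1 \cdots \theta_n}\int_0^{\theta_1}\!\cdots\!\int_0^{\theta_n} |G(\theta)|\, d\theta_1 \dots d\theta_n = 0,$

$\int_0^{\theta_1}\!\cdots\!\int_0^{\theta_n} |G(\theta)|\,|d\theta_1 \dots d\theta_n| \le C|\theta_1 \cdots \theta_n|, \quad (0 < |\theta_j| \le \pi,\ j = 1, \dots, n),$

where $G(\theta)$ is defined similarly to $G(s, t)$. We obtain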


THEOREM 2.2. Let $f$ be an integrable function on the unitary group $B$ such that
the function $F(\theta)$ defined by (9), where $f_0$ is the transform of $f$ under (1), satisfies
(16). Then the Poisson integral (1.3) has a limit if $z$ approaches the point $u_0$
on $B$ radially.


§3. Complete orthonormal systems on D. Orthogonal developments. 


1. Integration over $D$. Let $z$ be an $n$ by $m$ matrix ($n \le m$), $z_p$ its $p$th row and
$Z_p$ the submatrix consisting of the first $p$ rows ($p = 1, \dots, n$). The inner
product $(f, g)$ defined by (1.7) may be transformed into an iterated integral
over the product of $n$ hyperspheres by a procedure due to Hua (12), giving


(1) (f,g) = (1) f = wuts... f -_ {0 ton, 


where

$\dot w_p = \prod_{k=1}^{m} dw_{pk}\, d\bar w_{pk}, \quad w_p = z_p\Gamma_{p-1}^{-1},$

(2) $z_p = w_p\Gamma_{p-1} \quad (p = 1, \dots, n;\ \Gamma_0 = I),$
















and $\Gamma_{p-1}$ is a unique positive definite matrix such that
(3) As — (J — } a a el 


2. Construction of the matrix $\Gamma_{p-1}^{-1}$ ($2 \le p \le n$). Let $U_p(q) = U(q)$ be the
minor formed from the first $q$ rows and columns of

(4) $U_p = U = (u_{jk}) = I - Z_{p-1}^* Z_{p-1},$

$\mathfrak U_p(q) = \mathfrak U(q)$ the corresponding submatrix and $u_{m-j} = (u_{m-j,1}, \dots, u_{m-j,m-j-1})$.
Since $I - Z_{p-1}Z_{p-1}^*$ is the leading $(p - 1)$th principal submatrix of $I - zz^*$,
it is positive definite, hence $U$ is positive definite and $U(q)$ are positive. Thus
the hermitian matrix $U$ may be reduced to diagonal form by the well-known
Kronecker reduction (1) whose $(j + 1)$th step is


U(m —j —1) 0 


* * X m.m—$ 
Vin... ViUVi... Vio = P . , 
* 
(0 <j < _m — 2), where 
apheae 0 
View = | —un_ Vo (m — jG —1) I" J, 


0 
Xn.m-j = U(m — j)U~"(m — j — 1), 
Xai = U(1). 
Also 


tim-jU*(m — j — 1) = U"(m —j — 1): 


((-yre u(t Bem 3) bbe 


.»m—j-1 
where 
ee ea - 4) 
1,...,m—-j-l 
is the minor of U formed from rows 1,...,m — j with row k omitted and 
columns 1,...,m—j-— 1. 


Now Vj..7! equals V,,; with the sign of the matrix —u,,;@~-'(m — j — 1) 
changed. Also X,,, is real. Hence we can take 


.*—1 


(5) pest os Vr"... VatlXbs,..., Xda 


By an inductive proof we may show that 











v(2) o(?:4) y(h2--- = 13) 
6) Vr...Ve,< / “\a, 2) oe Fy “— 
il at U(1) U(2) °° U(j — 1) el 
(1 <j < m). (For details of the proof cf. (15).) From (4) it follows that 
1,..-» hr) p{te---+ hr) 
aa) sa: 
p—l 
aet(3, = 7 zu) ° 
l= 


(gj=1,...,4—Lrk=l,...,t:4++1<E7r<¢m,1<4 < m—1). 







3. Formula for z,,. From (2), (5), and (6) follows 


a ee 
(8) or = Do Goel (3 a o "Va + Cprsttys(P > 2) 
tn = wi <r cm; > -0), 
where 
(9) G: = (U,()U,G — DI? <i<r—1) 


Corr = [U,(r)/U,(r — 1)]} 
Qi = arr = i 
The formula

(10) $U_p(q) = \prod_{\nu=1}^{p-1}\left(1 - \sum_{j=1}^{q} w_{\nu j}\bar w_{\nu j}\right) \quad (p \ge 2)$

holds.

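We prove (10) by induction on $p$. Since by (8)

$U_2(q) = \det(I - z_1^* z_1) = \det(I - z_1 z_1^*) = 1 - \sum_{j=1}^{q} w_{1j}\bar w_{1j},$

where $z_1 = (z_{11}, \dots, z_{1q})$, (10) is true for $p = 2$. Assume (10) holds for $U_p(q)$
and prove for $U_{p+1}(q)$. Since $U_p(q) \ne 0$, there exists a unimodular matrix
$A$ such that the matrix $\mathfrak U_{p+1}(q) = (\delta_{jk} - \sum_{\nu=1}^{p} \bar z_{\nu j} z_{\nu k})$ $(1 \le j, k \le q)$ is equal
to

$A[\mathfrak U_p(q),\ 1 - z_p\,\mathfrak U_p^{-1}(q)\, z_p^*]A^*,$

(12), and thus

$U_{p+1}(q) = \det \mathfrak U_{p+1}(q) = \det \mathfrak U_p(q)\,(1 - z_p\,\mathfrak U_p^{-1}(q)\, z_p^*).$

Since $\det \mathfrak U_p(q) = U_p(q)$, using (10) we need only consider the last factor
on the right, which equals

$E = 1 - \sum_{j,k=1}^{q} z_{pj}\, U_{jk}\, \bar z_{pk} \big/ U_p(q),$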


Pe 













where U,, is the cofactor of the element ~, in the matrix %,(q). Substitute 
(8) for ,, and Z,,. The term w,, occurs when j = ror wheni = randj =r +1, 

.,q and ®, occurs when k = \ or i = X and k =} +1,...,¢. Thus the 
coefficient, Cy, of w,,t,(1 <r, A < gq) is 


r : > Ey eee ’ j 
Ca = U; “a cr 7 0,( tr] Vp, + conDa | ’ 
j=r+l1 1, eee? 
where 


: ei Bescce me , 
Da = Gp » Uit( a] ) + Uxrsem. 


k=A+1 Bo cocgt 


By means of elementary properties of determinants it is not difficult to prove 
in case A qr that Dy = 0 for j=r,\ <randj=r+l1,...,qA<r. 
Hence $C_{r\lambda} = 0$ for $\lambda \ne r$. Also $C_{rr} = 1$. A similar proof holds for $r < \lambda$. Thus

$E = 1 - \sum_{j=1}^{q} w_{pj}\bar w_{pj}$

and $U_{p+1}(q)$ has the desired form, which proves (10).
4. Structure of the CONS on D. 
THEOREM 3.1. $(P, Q) \ne 0$ implies equations (1.8).


Proof. We first show that 


(11) Zpr = Wpr > B Wy tyr I] (Wy rer 91 s+ Wy apr 18;Wr,; Dr«8;), 

where B, is a function of wy, (1 <7 < p — L)sA1,..., A, A, take on values 
in the set 1,2,...,p — l;a,,...,a@,isa subset of 1,...,7— land (@;,..., 
8,,8,) is a permutation of (a;,...,a,%)(¢ =1,...,7 — 1). (Notice that 


each term of (11) can be grouped into pairs wagt},; in two ways: (i) each pair 
belongs to the same row, (ii) each pair belongs to the same column.) Since 
Zi, = m,, (11) holds for p = 1. Now assume (11) for p — 1, 1 < r < m, and 
prove for p. Upon expanding 
~; ae 
U. , , 2 , 
4 


(given by (7)) and multiplying out the resulting factors, we find that its 
general term is 


Zr1012A181 eee Bh gash gq Dorit eho+1Bi9 
where Ag(a = 1,..., 5 + 1) takes on values from 1, 2,...,p — ljaj,...,@ 
is a subset of 1,...,7— 1 and 


Bas» eeey Bass B, 


is a permutation of a;,...,a@,,7. Thus the general term of z,, would be 
w,, times 










, " ‘ . . 
Cp ®ry01%r1 Bay «+ + DroeredgBereZo+1 Ae +18; p iW pr, 
Cot = Cpi/WyrDyr. Replace 
2d101d2 A181 eee by Wh eri Bar « - + 


times a factor which by induction already has the required form and (11) 
follows. 

Now consider $z_{pr}^{s_{pr}}$. From (11) we see that except for the first factor $w_{pr}^{s_{pr}}$,
if $w_{\nu j}$ occurs, then a factor $\bar w_{\nu k}$ also occurs and if $w_{\nu j}$ appears, then $\bar w_{\mu j}$ also
appears. Consequently in the expression for $z_{pr}^{s_{pr}}$ for each $\nu$ ($\nu = 1, \dots, p - 1$)
the sum of the exponents of the factors $w_{\nu j}$ ($j = 1, \dots, m$) equals the sum of
the exponents of the $\bar w_{\nu k}$ ($k = 1, \dots, m$) and the sum of the exponents of
$w_{pj}$ equals the sum of the exponents of $\bar w_{pk}$ increased by $s_{pr}$. Similarly for the
columns. Thus $P$ can be expressed in the form

(12) $P(z) = P_0(w\bar w)\prod_{j,k} w_{jk}^{s_{jk}} = P_0(w\bar w)\,P(w),$

where $P_0(w\bar w)$ contributes the same exponents to the sum of the elements in
the $\nu$th row of $w$ and of $\bar w$ and similarly for the columns of $w$ and $\bar w$. An analogous
expression holds for $Q$.

In $(P, Q)$ replace $P$ and $Q$ by (12). Since $\{w_{\nu k}^{s_{\nu k}}\}$ ($k = 1, \dots, m$) forms an
orthogonal set on $w_\nu w_\nu^* < 1$ (2), $(P, Q) \ne 0$ if and only if for each $k$ the
exponent of $w_{\nu k}$ equals the exponent of $\bar w_{\nu k}$. Consequently if we sum the
exponents of the $\nu$th row, owing to the form of $P_0(w\bar w)$, $Q_0(w\bar w)$ we obtain the
first of equations (1.8) and summing the exponents of the $\mu$th column the
second of equations (1.8). Thus the theorem is proved.


5. CONS. Orthogonal development. A CONS is constructed from the set of
powers $\{P(z)\}$ as follows. Let $\alpha = (\alpha_1, \dots, \alpha_{m+n})$ be a set of non-negative
integers with $\sum_{j=1}^{n} \alpha_j = \sum_{j=1}^{m} \alpha_{n+j} = p$. The powers of the set $S(\alpha) =
S(\alpha_1, \dots, \alpha_{m+n})$ such that

$\sum_k s_{jk} = \alpha_j, \quad \sum_j s_{jk} = \alpha_{n+k}$

need not be orthogonal to each other. (There exist sets $S(\alpha)$ whose members
are not all orthogonal to each other; for example, in the 2 by 2 case the elements
$z_{11}z_{22}$, $z_{12}z_{21}$ are not orthogonal (12).) However if $P \in S(\alpha)$, $Q \in S(\beta)$,
where $\alpha_j \ne \beta_j$ for some $j$, then by Theorem 3.1 $(P, Q) = 0$. We order the
elements of the set $S(\alpha)$ in some convenient manner into a sequence $P_0, P_1, \dots,
P_{\mu(\alpha)}$. An ONS is constructed from these elements by the Gram-Schmidt
formulas

(13) $\phi_\nu^{(\alpha)}(z) = \det\begin{pmatrix} a_{00} & \cdots & a_{\nu 0} \\ \vdots & & \vdots \\ a_{0,\nu-1} & \cdots & a_{\nu,\nu-1} \\ P_0 & \cdots & P_\nu \end{pmatrix} (D_{\nu-1}D_\nu)^{-1/2},$

where

$a_{ij} = (P_i, P_j) \quad (0 \le i, j \le \nu),$

$D_\mu = \det(a_{\alpha\beta}) \quad (0 \le \alpha, \beta \le \mu;\ \mu = \nu - 1 \text{ or } \nu,\ \nu \ne 0), \quad D_{-1} = 1.$
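The orthonormalization can be sketched numerically when only the Gram matrix $a_{ij} = (P_i, P_j)$ is known. The code below uses the usual step-by-step Gram-Schmidt process in coefficient space rather than the determinant formula (13); both are standard ways of orthonormalizing the same ordered sequence. The function names and the test Gram matrix (that of $1, x, x^2$ on $[0, 1]$, not a set $S(\alpha)$ from the paper) are assumptions made for the illustration.

```python
import math


def inner(G, u, v):
    """Inner product of sum_i u[i] P_i and sum_j v[j] P_j, with G[i][j] = (P_i, P_j)."""
    n = len(G)
    return sum(u[i] * v[j].conjugate() * G[i][j]
               for i in range(n) for j in range(n))


def gram_schmidt(G):
    """Coefficient vectors of an orthonormal sequence phi_0, phi_1, ... built from P_0, P_1, ..."""
    n = len(G)
    basis = []
    for nu in range(n):
        vec = [0.0] * n
        vec[nu] = 1.0                       # start from P_nu
        for phi in basis:                   # remove components along earlier phi
            c = inner(G, vec, phi)
            vec = [x - c * y for x, y in zip(vec, phi)]
        norm = math.sqrt(inner(G, vec, vec).real)
        basis.append([x / norm for x in vec])
    return basis


# Illustrative Gram matrix: monomials 1, x, x^2 with (P, Q) = integral of P*Q over [0, 1].
G = [[1.0 / (i + j + 1) for j in range(3)] for i in range(3)]
basis = gram_schmidt(G)
print(basis[1])   # coefficients of phi_1 in terms of 1 and x
```

Here `basis[nu]` holds the coefficients of $\phi_\nu$ with respect to $P_0, \dots, P_\nu$, so each $\phi_\nu$ involves only the first $\nu + 1$ powers, as in (13).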


Now order the system $\{\phi_\nu^{(\alpha)}\}$ into a sequence $\phi_1, \phi_2, \dots$. The orthogonal
development of any $f \in L^2$ with respect to the ONS is

(14) $\sum_\nu a_\nu \phi_\nu,$

where $a_\nu$ are the Fourier coefficients, $(f, \phi_\nu)$, of $f$. From Bergman's theory
(2) it is known that (14) converges absolutely and continuously to $f$ on $D$
(continuous convergence means that the series converges uniformly on any
compact set contained in $D$).


§4. Applications of the CONS. 


1. Abelian and Tauberian theorems. Let $\{a_\nu\}$ be an arbitrary sequence of
numbers and consider the behaviour of (3.14) as $z \in D$ approaches a point
$u_0 \in B$. In particular let $u_0 = [I, 0]$, $I = I^{(n)}$, and $z \to u_0$ along the set of
points $[r, 0]$ where $r$ is the diagonal matrix $[r_1, \dots, r_n]$, $0 < r_j < 1$. When
$z = [r, 0]$ it is seen that $P_\nu$ of the set $S(\alpha)$ is either equal to 0 or to

$\prod_{j=1}^{n} r_j^{p_j}$

in case $\alpha_j = \alpha_{n+j} = p_j$ for $j = 1, \dots, n$ and $\alpha_{n+j} = 0$ for $j > n$. Consequently
for a fixed set $\alpha$ either all $P_\nu$ are zero or there is one $P_\nu$ different from zero in
case $\alpha_j = \alpha_{n+j}$ for $j = 1, \dots, n$ and $\alpha_{n+j} = 0$ for $j > n$. Order the elements
of the set $S(\alpha)$ so that the non-zero term is the last term of the set. Then in
(3.13) only $\phi_\tau^{(\alpha)}(z)$, $\tau = \mu(\alpha)$, is different from zero when $z = [r, 0]$ and

$\phi_\tau^{(\alpha)}(z) = [D_{\tau-1}/D_\tau]^{1/2} P_\tau(z) = (D_{\tau-1}/D_\tau)^{1/2}\, r_1^{p_1} \cdots r_n^{p_n}.$

Thus (3.14) reduces to a multiple power series. Let this series be summed by
the usual method for power series. Then

(1) $S(r) = \sum_{p_1=0}^{\infty} \cdots \sum_{p_n=0}^{\infty} c_{p_1 \dots p_n}\, r_1^{p_1} \cdots r_n^{p_n}, \quad c_{p_1 \dots p_n} = a_\nu (D_{\tau-1}/D_\tau)^{1/2},$

where $\phi_\tau^{(\alpha)} = \phi_\nu$ is the ordering of the ONS $\{\phi_\tau^{(\alpha)}\}$ into a simple sequence.
Let $S_{q_1 \dots q_n}$ be the partial sum of $S(r)$:

$S_{q_1 \dots q_n} = \sum_{p_1=0}^{q_1} \cdots \sum_{p_n=0}^{q_n} c_{p_1 \dots p_n}\, r_1^{p_1} \cdots r_n^{p_n}.$

The following Abelian theorem is valid:

THEOREM 4.1. If $S(I)$ exists and

$|S_{q_1 \dots q_n}| < C,$

where $C$ is an absolute constant, then $S(r)$ is uniformly convergent for $0 \le r_j \le 1$
and $\lim_{r \to I} S(r) = S(I)$. ($r \to I$ means $[r, 0] \to [I, 0]$.) See (4) for a proof.


Also a Tauberian theorem proved by Knopp (7) for double series may be
extended to multiple series.

THEOREM 4.2 (Tauberian theorem). Let the series $S(r)$ converge for each
$[r, 0] \in D$, $r = [r_1, \dots, r_n]$, and for these $r$ let $|S(r)| \le K$, where $K$ is an
absolute constant. If

(2) $|c_{p_1 \dots p_n}|\,(p_1 + \dots + p_n)^n \le M < \infty,$

then $\lim_{r \to I} S(r) = S$ implies $S(I) = S$, that is,

(3) $\sum_{p_1=0}^{\infty} \cdots \sum_{p_n=0}^{\infty} c_{p_1 \dots p_n} = S.$

In order to prove Theorem 4.2, Theorem 3 of § 3 and the proofs in § 4 of
Knopp's paper must be proved for $n$-fold series ($n \ge 3$). Using condition (2)
Theorem 3 has been extended to multiple series in (18). Also by means of (2)
the proofs in § 4 follow for multiple series. In addition see (13) where a similar
condition is utilized for multiple series summed spherically.

On the other hand if we let $z \to [I, 0]$ along the set $[\rho I, 0]$, $0 < \rho < 1$, and
sum series (3.14) by diagonals:

(4) $S_0(\rho) = \sum_{p=0}^{\infty} b_p \rho^p,$

where

(4a) $b_p = \sum_{p_1 + \dots + p_n = p} c_{p_1 \dots p_n},$

then (4) is a simple series and the boundedness conditions on $S_{q_1 \dots q_n}$ and
$S(r)$ can be omitted in Theorems 4.1 and 4.2 (17). Abel's theorem reads: if
$S_0(I)$ exists, then $S_0(\rho I)$ is uniformly convergent for $0 \le \rho \le 1$ and $\lim_{\rho \to 1} S_0(\rho I)
= S_0(I)$; and Theorem 4.2 becomes:
Let the series $S_0(\rho I)$ converge for each $[\rho I, 0] \in D$ ($0 < \rho < 1$). If $b_p = O(1/p)$,
then $\lim_{\rho \to 1} S_0(\rho I) = S_0$ implies $S_0(I) = S_0$.
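The one-variable statements can be illustrated numerically. The sketch below takes the illustrative coefficients $b_p = (-1)^p/(p + 1)$, which satisfy $b_p = O(1/p)$ and for which $\sum b_p = \log 2$ and $\sum b_p\rho^p = \log(1 + \rho)/\rho$; these coefficients and the names `diagonal_sum` and `b` are assumptions chosen for the example and do not come from an orthogonal development in the paper.

```python
import math


def diagonal_sum(b, rho, terms=200000):
    """Partial sum of S_0(rho) = sum over p of b(p) * rho^p."""
    total, power = 0.0, 1.0
    for p_index in range(terms):
        total += b(p_index) * power
        power *= rho
    return total


def b(p_index):
    # Illustrative coefficients b_p = (-1)^p / (p + 1), so that b_p = O(1/p).
    return (-1) ** p_index / (p_index + 1)


near_boundary = diagonal_sum(b, 0.999)   # value of S_0 close to rho = 1
at_boundary = diagonal_sum(b, 1.0)       # partial sum of the series at rho = 1
print(near_boundary, at_boundary, math.log(2))
```

Both printed sums are close to $\log 2$, in line with Abel's theorem (the limit along $\rho \to 1$ equals the sum at $\rho = 1$ when the latter exists).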


2. Cauchy's inequality and mean value theorem. In the next paragraphs it is
convenient to group the elements of the CONS $\{\phi_\nu\}$ of same degree; hence,
let

$\phi_1^{(p)}, \dots, \phi_{M_p}^{(p)}$

be the terms of degree $p$. Then for any $f \in L^2$ on $D$

(5) $f(z) = \sum_{p=0}^{\infty}\sum_{j=1}^{M_p} a_j^{(p)}\phi_j^{(p)}(z),$

where the convergence is continuous and absolute for $z \in D$. Multiply (5)
by $\bar f$ and integrate over the domain

(6) $D_\rho = [z \mid \rho^2 I - zz^* > 0,\ 0 < \rho < 1].$

Clearly $D_\rho \subset D$ for $0 < \rho < 1$. Since the convergence of (5) is uniform and
absolute on $D_\rho$,

(7) $I(\rho) = \dfrac{1}{V_\rho}\int_{D_\rho} f\bar f\, dW = V_\rho^{-1}\sum_{p=0}^{\infty}\sum_{j=1}^{M_p}\sum_{q=0}^{\infty}\sum_{k=1}^{M_q} a_j^{(p)}\bar a_k^{(q)}\int_{D_\rho}\phi_j^{(p)}\bar\phi_k^{(q)}\, dW,$

($V_\rho$ being the volume of $D_\rho$). In the integral

$I = \int_{D_\rho}\phi_j^{(p)}\bar\phi_k^{(q)}\, dW_z$

set $z = \rho w$. Then $D_\rho \to D$, $dW_z \to \rho^{2mn}\, dW_w$ (cf. § 3.1) and

$\phi_j^{(p)}(z) = \sum_{\Sigma s_{jk} = p} \text{constant} \prod z_{jk}^{s_{jk}} = \rho^p\,\phi_j^{(p)}(w).$

Hence

$I = \rho^{2mn+p+q}(\phi_j^{(p)}, \phi_k^{(q)}) = \rho^{2mn+2p}\,\delta_{pq}\,\delta_{jk}.$

Also $V_\rho = \rho^{2mn}V_0$ where $V_0$ is the volume of $D$. Thus (7) becomes

(8) $I(\rho) = \dfrac{1}{V_\rho}\int_{D_\rho} f\bar f\, dW = \dfrac{1}{V_0}\sum_{p=0}^{\infty}\rho^{2p}\sum_{j=1}^{M_p}|a_j^{(p)}|^2.$

From (8)

$\dfrac{1}{V_0}\sum_{j=1}^{M_p}|a_j^{(p)}|^2 \le (1/\rho^{2p})\max_{z \in D_\rho}|f(z)|^2.$

Now according to a theorem proved by Hua (6) if $f$ is analytic on the closed
circular domain $\bar D_\rho$, then $f$ attains its maximum modulus on the circular
manifold

(9) $B_\rho = [z \mid zz^* = \rho^2 I].$

Hence we get the Cauchy inequality:

(10) $\dfrac{1}{V_0}\sum_{j=1}^{M_p}|a_j^{(p)}|^2 \le (1/\rho^{2p})\max_{z \in B_\rho}|f(z)|^2.$

THEOREM 4.3 (mean value theorem). $I(\rho)$ defined by (7) is a monotone
increasing function of $\rho$ ($0 < \rho < 1$) and $\log I(\rho)$ is a convex function of $\log\rho$.

Proof. The monotonicity of $I(\rho)$ is obvious from (8). The proof of convexity
is the same as the proof in (17, p. 174) for the one variable case.
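Both conclusions of Theorem 4.3 depend only on the shape of (8), a power series in $\rho^2$ with non-negative coefficients. The following sketch checks them numerically for hypothetical coefficient values; the list `weights` stands for the numbers $V_0^{-1}\sum_j |a_j^{(p)}|^2$ and is invented for the example. Convexity of $\log I$ in $\log\rho$ is tested in its midpoint form $I(\sqrt{\rho_1\rho_2})^2 \le I(\rho_1)\,I(\rho_2)$.

```python
import math


def mean_value(weights, rho):
    """I(rho) = sum over p of weights[p] * rho^(2p), the shape of formula (8)."""
    return sum(w * rho ** (2 * p) for p, w in enumerate(weights))


# Hypothetical non-negative values standing for sum_j |a_j^(p)|^2 / V_0, p = 0, 1, ...
weights = [0.5, 2.0, 0.0, 1.25, 3.0]
radii = [0.1 * k for k in range(1, 11)]

monotone = all(mean_value(weights, r1) <= mean_value(weights, r2)
               for r1, r2 in zip(radii, radii[1:]))

# Midpoint convexity of log I as a function of log rho (small slack for rounding).
log_convex = all(
    mean_value(weights, math.sqrt(r1 * r2)) ** 2
    <= mean_value(weights, r1) * mean_value(weights, r2) * (1 + 1e-12)
    for r1 in radii for r2 in radii)

print(monotone, log_convex)
```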


3. A mean value theorem over B,. 


THEOREM 4.4. In the case $n = m$ the integral

(11) $I_1(\rho) = \int_{B_\rho} |f|^2\, dV$

is a monotone increasing function of $\rho$ and $\log I_1(\rho)$ is a convex function of $\log\rho$.


Proof. Hua (6) has shown how to construct a set $\{\psi_\nu\}$ orthonormalized with
respect to the inner product

$(\psi_\nu, \psi_\mu)_B = \int_B \psi_\nu\bar\psi_\mu\, dV$

from the CONS $\{\phi_\nu\}$ as follows. Since $B$ is a circular space $(\phi_j^{(p)}, \phi_k^{(q)})_B = 0$
if $p \ne q$. Define the vector

$z_p = (\phi_1^{(p)}, \dots, \phi_{M_p}^{(p)}).$

Then

$(z_p', \bar z_p)_B = ((\phi_j^{(p)}, \phi_k^{(p)})_B) = K_p$


is a matrix of constants. Since z,’Z, > 0 if 


e* — | 4{7)|2 (p) \2 

ZpSp = [G1 | +... + Oa! 
is positive, K, > 0 and there exists a unitary matrix U such that U*K,U = A, 
where A is a diagonal matrix with positive elements on the diagonal. Now 


{yp}, defined by 
Yo = BU = (6:”,..., O87), 
is a CONS on D if {z,} is, since 
((z,U7)’, 2,U) = U*(z}, 2,)U = U*((o?, o”))U = U*IU = I. 
Let 
- - a” e” om 
Then {¥;”} is an ONS with respect to integration over B and the orthogonal 
development (5) of f € L* can be written as 
: Mp 
(12) f(z) ~“ : > by”, 
p=0 ro | 
where 
) (p) rl (p)i 2 
b? = (f, v; )/ || ; . 
Multiply (12) by f and integrate over B,. By a procedure similar to that in 
paragraph 3 we obtain the formula 


. r M, 
(13) ho) = J, stay = Xo” De ay’ 


p=0 j=l 


from which the conclusions of the theorem follow. 


Note added in Proof. Recently it has come to my attention that Hua and Look (21) have
proved that $F(z) \to f(u_0)$ as $z \to u_0$ in any manner. Further for continuous $f$ on $B$, the solution
$F$ given by (3) is unique. They also consider Abel summability for continuous functions of
the unitary group.


REFERENCES

1. A. A. Albert, Modern higher algebra (Chicago, 1937).
2. S. Bergman, The kernel function and conformal mapping, Math. Surveys, No. V (1950).
3. S. Bochner, Group invariance of Cauchy's formula in several variables, Ann. Math., 45 (1944), 686-707.
4. T. J. I'A. Bromwich and G. H. Hardy, Some extensions to multiple series of Abel's theorem on continuity of power series, Proc. London Math. Soc., 2 (1905), 161-189.
5. E. Cartan, Sur les domaines bornés homogènes de l'espace de n variables complexes, Abh. Math. Sem. Univ. Hamburg, 11 (1936), 116-162.
6. L. K. Hua, On the theory of functions of several complex variables, I-III, Acta Math. Sinica, 2 (1953), 288-323; 5 (1955), 1-25, 205-242 (in Chinese); M.R., 17 (1956), p. 191. Also Harmonic analysis of the classical domain in the study of analytic functions of several complex variables, mimeographed notes (about 1956).
7. K. Knopp, Limitierungs-Umkehrsätze für Doppelfolgen, Math. Z., 46 (1939), 573-589.
8. D. B. Lowdenslager, Potential theory and a generalized Jensen-Nevanlinna formula for functions of several complex variables, J. Math. Mech., 7 (1958), 207-218.
9. D. B. Lowdenslager, Potential theory in bounded symmetric homogeneous complex domains, Ann. Math., 67 (1958), 467-484.
10. C. C. MacDuffee, The theory of matrices (Ergebnisse der Mat. und ihrer Grenzgebiete [Chelsea, 1946]).
11. J. Mitchell, On double Sturm-Liouville series, Amer. J. Math., 65 (1943), 616-636.
12. J. Mitchell, An example of a complete orthonormal system and the kernel function in the geometry of matrices, Proceedings of the Second Canadian Mathematical Congress (Vancouver, 1949), 155-163.
13. J. Mitchell, On the spherical summability of multiple orthogonal series, Trans. Amer. Math. Soc., 71 (1951), 136-151.
14. J. Mitchell, Potential theory in the geometry of matrices, Trans. Amer. Math. Soc., 79 (1955), 401-422.
15. J. Mitchell, Orthogonal systems on matric spaces, West. Research Lab. Pub., Scientific Paper 60-94801-1-P2 (1956).
16. K. Morita, On the kernel functions of symmetric domains, Sci. Reports of Tokyo Kyoiku Daigaku Sec. A, 5 (1956), 190-212.
17. E. C. Titchmarsh, The theory of functions (2nd ed., Oxford University Press, 1939).
18. S. H. Tung, Tauberian theorems for multiple series, unpublished Master's thesis (The Pennsylvania State University).
19. H. Weyl, The classical groups (Princeton University Press, 1946).
20. A. Zygmund, Trigonometrical series, Monog. Mat., Tom V (Warsaw, 1935).
21. L. K. Hua and K. H. Look, Theory of harmonic functions in classical domains, Sci. Sinica, 8, No. 10 (1959), 1032-1094.


The Pennsylvania State University 











A THEOREM ON PARTIALLY ORDERED SETS, 
WITH APPLICATIONS TO FIXED POINT THEOREMS 


SMBAT ABIAN AND ARTHUR B. BROWN


In this paper the authors prove Theorem 1 on maps of partially ordered 
sets into themselves, and derive some fixed point theorems as corollaries. 

Here, for any partially ordered set $P$, any mapping $f: P \to P$ and any
point $a \in P$, a well ordered subset $W(a) \subset P$ is constructed. Except when
$W(a)$ has a last element $\xi$ greater than or not comparable to $f(\xi)$, $W(a)$, although
constructed differently, is identical with the set $A$ of Bourbaki (3)
determined by $a$, $f$, and $P_1$: $\{x \mid x \in P,\ x \le f(x)\}$.

Theorem 1 and the fixed point Theorems 2 and 4, as well as Corollaries 2 
and 4, are believed to be new. 

Corollaries 1 and 3 are respectively the well-known theorems given in (1, 
p. 54, Theorem 8, and Example 4). 

The fixed point Theorem 3 is that of (1, p. 44, Example 4); and has as a 
corollary the theorem given in (2) and (3). 

The proofs are based entirely on the definitions of partially and well ordered 
sets and, except in the cases of Theorem 4 and Corollary 4, make no use of 
any form of the axiom of choice. 

In what follows, "$a < b$" implies that $a$ and $b$ are distinct. Furthermore, we
shall always deal with elements and subsets of a given partially ordered set
$P$, and "lub $T$" will denote exclusively "the least upper bound of $T$ in $P$";
that is, an upper bound $z$ of $T$ such that if $s$ is any other upper bound of $T$,
then $z < s$. The symbol "$\subset$" shall mean "is a subset (not necessarily proper)
of."


Definition. Let $P$ be a partially ordered set and $f$ a mapping of $P$ into $P$.
For any $a \in P$, an $a$-chain $C_r$ is a subset of $P$ satisfying the following conditions:

(1) $C_r$ is well ordered, with $a$ as its first element and $r$ as its last element;

(2) If $z \in C_r$ and $z \ne r$, then $f(z) \in C_r$, $z < f(z)$, and there exists no
$y \in C_r$ for which $z < y < f(z)$;

(3) If $T$ is a non-empty subset of $C_r$, then the least upper bound (in $P$) of
$T$ exists and is in $C_r$.

It will follow from Lemma 4 below that, for given $P$, $f$, and $a$, $C_r$ is uniquely
determined by $r$.

We designate by $W(a)$ the set of all $r \in P$ for which there exists an $a$-chain
$C_r$ having $r$ as its last element. We note that (2) implies that $W(a) = \{a\}$
except when $a < f(a)$.
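For a finite partially ordered set the definition can be followed directly: every $a$-chain then has the form $a, f(a), f(f(a)), \dots$ with each term strictly below the next, so $W(a)$ is obtained by iterating $f$ as long as the value strictly increases. The sketch below does this for the divisors of 60 ordered by divisibility. The function names and the two sample maps are assumptions made for the illustration; the second map is not isotone and shows that the last element $\xi$ of $W(a)$ need not be a fixed point, only that $\xi < f(\xi)$ fails.

```python
def build_W(a, f, less):
    """W(a) for a finite poset: iterate f while the value strictly increases."""
    chain = [a]
    while less(chain[-1], f(chain[-1])):
        chain.append(f(chain[-1]))
    return chain


def divides_strictly(x, y):
    """Strict order on the divisors of 60: x < y when x divides y and x != y."""
    return x != y and y % x == 0


def double_if_possible(x):
    """Sample map on the divisors of 60: multiply by 2 when the result still divides 60."""
    return 2 * x if 60 % (2 * x) == 0 else x


def complement(x):
    """Sample map that is not isotone: x -> 60 / x."""
    return 60 // x


print(build_W(3, double_if_possible, divides_strictly))  # [3, 6, 12]
print(build_W(2, complement, divides_strictly))          # [2, 30]
```

In the first case the last element 12 satisfies $f(12) = 12$; in the second the last element 30 has $f(30) = 2$, which is not above 30.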


Received September 1, 1959. 



















Under the hypotheses of the Definition, we shall first prove the following 
lemmas. 


LEMMA 1. If $r \in W(a)$ and $C_r$ is an $a$-chain with last element $r$, then $C_r \subset
W(a)$.

Proof. If $t \in C_r$, the set of all elements of $C_r$ which are $\le t$ is easily seen
to be an $a$-chain, and hence $t \in W(a)$. Therefore the lemma is true.

LEMMA 2. If $r \in W(a)$ and $r < f(r)$, then $f(r) \in W(a)$.

Proof. The set $C_r \cup \{f(r)\}$ is obviously an $a$-chain, and hence $f(r) \in W(a)$.

LEMMA 3. If $r, s \in W(a)$ and $C_r$ is an $a$-chain with last element $r$, then either
$s \in C_r$ or $r < s$.

Proof. Let $T = C_r \cap C_s$. By (1), $T \ne \emptyset$ and hence, by (3), $z = \mathrm{lub}\ T$
exists and $z \in T$. If $s \notin C_r$ then $z \ne s$. If also $z \ne r$ then, by (2), $z < f(z) \in T$,
contrary to the fact that $z = \mathrm{lub}\ T$. Hence $s \notin C_r$ implies that $z = r$, so that
$r \in C_s$; and since $r \ne s$, we see by (1) that $r < s$. Since, by (1), $s \in C_r$ and
$r < s$ cannot both hold, we infer the truth of Lemma 3.

LEMMA 4. If $r \in W(a)$, there is just one $C_r$ with last element $r$, namely the
set of all elements of $W(a)$ which are $\le r$.

Proof. This follows from Lemmas 1 and 3.

THEOREM 1. Let $P$ be a non-empty partially ordered set, $f$ a map of $P$ into $P$,
and $a$ an arbitrary element of $P$. Then

(4) $W(a)$ is well ordered with $a$ its first element.

Moreover, if $\xi = \mathrm{lub}\ W(a)$ exists, then

(5) $W(a)$ is an $a$-chain with $\xi$ its last element,

and

(6) $\xi \not< f(\xi)$.

Proof. Let $H$ be any non-empty subset of $W(a)$, and $r \in H$. Since $r \in H \cap C_r$,
we see by (1) that $H \cap C_r$ has a first element, which, in view of Lemma 4,
is the first element of $H$. Hence $W(a)$ is well ordered. By Lemma 1, $a \in W(a)$,
and by (1), if $r \in W(a)$ then $a \le r$. Thus we conclude that $a$ is the first
element of $W(a)$. Hence (4) is valid.

Next, assume $\xi = \mathrm{lub}\ W(a)$ exists and let $W^* = W(a) \cup \{\xi\}$. We shall
show that $W^*$ is an $a$-chain. Since $W(a)$ is well ordered, $W^*$ is well ordered too
and thus (1) is satisfied for $W^*$, with $a$ its first and $\xi$ its last element. Now,
let $z \in W^*$ and $z \ne \xi$. Then $z \in W(a)$ and $\{x \mid x \in W(a),\ z < x\} \ne \emptyset$. Since
$W(a)$ is well ordered, $z$ has an immediate successor $r$ in $W(a)$, hence in $W^*$.
By Lemma 4, $z$ and $r$ are the last two elements of $C_r$. Hence, by (2) applied to
$z$ as an element of $C_r$, we see that $f(z) = r$, so that (2) is satisfied for $W^*$.
To prove (3) for $W^*$, let $T$ be any non-empty subset of $W^*$. Obviously $\xi$ is an
upper bound of $T$. If there is no element of $W(a)$ which exceeds every element
of $T$ then, in view of the well orderedness of $W(a)$, any upper bound of $T$
is also an upper bound of $W(a)$ and hence is $\ge \xi$, which implies that $\xi = \mathrm{lub}\ T$,
and thus $\mathrm{lub}\ T \in W^*$. If there is an element $r \in W(a)$ which exceeds every
element of $T$, then $T \subset W(a)$ and, by Lemma 4, $T \subset C_r$. Hence, by (3),
$\mathrm{lub}\ T$ exists and is in $C_r$, and therefore, by Lemma 1, $\mathrm{lub}\ T \in W(a)$, so that
again $\mathrm{lub}\ T \in W^*$. Consequently (3) is satisfied for $W^*$. Therefore $W^*$ is an
$a$-chain with $\xi$ its last element, which implies that $\xi \in W(a)$ and $W(a) = W^*$.
Thus (5) is valid.

Now, suppose $\xi < f(\xi)$. By Lemma 2, $f(\xi) \in W(a)$, so that (5) is contradicted.
Therefore $\xi \not< f(\xi)$. Thus (6) is valid, and Theorem 1 is proved.


THEOREM 2. Let $P$ be a partially ordered set in which

(7) lub of every non-empty well ordered subset $W \subset P$ exists.

Let $f$ be a map of $P$ into $P$ such that $f$ is isotone, that is,

(8) for every two elements $x, y \in P$ with $x \le y$, we have $f(x) \le f(y)$;

and

(9) there exists an element $a \in P$ with $a \le f(a)$.

Then there exists at least one $\xi \in P$ such that $\xi = f(\xi)$. In fact, $\xi = \mathrm{lub}\ W(a)$
is such an element.

Proof. If $a = f(a)$, the conclusion is obvious. Now suppose $a < f(a)$.

Consider the set $W(a)$, where $a$ is the element referred to in (9). By (4)
and (7), $\xi = \mathrm{lub}\ W(a)$ exists, and hence by (5), $W(a) = C_\xi$. By (9) and
Lemma 2 we see that $f(a) \in W(a)$, and therefore $a < \xi$. Since $W(a)$ is an
$a$-chain and $W(a) - \{\xi\}$ is non-empty, we infer from (3) that $\theta = \mathrm{lub}[W(a) - \{\xi\}]$
is in $W(a) = C_\xi$. According as $\theta = \xi$ or $\theta < \xi$, we have

(10) $\xi = \mathrm{lub}\ [W(a) - \{\xi\}]$

or

(11) $\xi$ is the immediate successor of $\theta$ in $W(a)$.

If (10) holds, take any element $z \in [W(a) - \{\xi\}]$. Then $z < \xi$, and by
(8), $f(z) \le f(\xi)$. By (2), $z < f(z)$. Consequently $z < f(\xi)$ and therefore $f(\xi)$
is an upper bound for $[W(a) - \{\xi\}]$, and thus, by (10), $\xi \le f(\xi)$.

If (11) holds, by (2), $f(\theta) = \xi$. Also, since $\theta < \xi$, by (8), $f(\theta) \le f(\xi)$, so that
again $\xi \le f(\xi)$.

Since $\xi \le f(\xi)$, we see from (6) that $\xi = f(\xi)$. Thus Theorem 2 is proved.

Remark. An alternative proof of Theorem 2 can be given by considering the
set $\{x \mid x \in P,\ x \le f(x)\}$ and using Theorem 1.

COROLLARY 1. Let $f$ be any isotone map of a non-empty complete lattice $L$
into itself. Then $\xi = f(\xi)$ for some $\xi \in L$.

Proof. In view of Theorem 2, we need only verify (9). Choose $a$ = the greatest
lower bound of $L$. Then clearly (9) is valid, and Corollary 1 follows from
Theorem 2.
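A finite instance of Corollary 1 can be traced in code. The lattice is the set of all subsets of $\{0, \dots, 4\}$ ordered by inclusion, $a$ is the empty set (the greatest lower bound of the lattice), and the isotone map adds node 0 and the successors of the nodes already present in a small directed graph. The graph, the map `step`, and the function names are hypothetical choices for this illustration; in the finite case $W(a)$ is just the sequence of iterates, so its last element is the fixed point $\xi$.

```python
def iterate_to_fixed_point(f, a):
    """Follow a, f(a), f(f(a)), ... while the values strictly increase."""
    x = a
    while x < f(x):          # for frozensets, < means proper inclusion
        x = f(x)
    return x


# Hypothetical directed graph: node -> list of successor nodes.
edges = {0: [1], 1: [2], 2: [0, 3], 3: [], 4: [0]}


def step(nodes):
    """Isotone map on subsets: add node 0 and every successor of a node already present."""
    return frozenset({0}) | nodes | frozenset(t for n in nodes for t in edges[n])


xi = iterate_to_fixed_point(step, frozenset())
print(sorted(xi), step(xi) == xi)
```

The resulting set consists of the nodes reachable from node 0, and applying `step` to it changes nothing.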
COROLLARY 2. Let P be a partially ordered set in which

(12) every non-empty well ordered subset W ⊂ P which is bounded above has a
lub.

Let f be an isotone map of P into P and let there exist two elements a, b ∈ P
such that
(13) a ≤ f(a) ≤ f(b) ≤ b.

Then there exists ξ ∈ P such that ξ = f(ξ) and a ≤ ξ ≤ b. In fact, ξ = lub W(a)
is such an element.


Proof. Let Q = {x | x ∈ P, a ≤ x ≤ b}. Since f is isotone, we see by (13)
that if x ∈ Q, then a ≤ f(a) ≤ f(x) ≤ f(b) ≤ b. Hence f maps Q into Q.
Moreover, since Q is bounded above by b, we see from (12) that (7) is valid for Q.
Therefore the hypotheses of Theorem 2 are satisfied by Q, f, and a. Thus from
Theorem 2 we infer the validity of Corollary 2.

COROLLARY 3. If f is an isotone map of a conditionally complete lattice into
itself and if a ≤ f(a) ≤ f(b) ≤ b, then ξ = f(ξ) for some ξ with a ≤ ξ ≤ b.

THEOREM 3. Let P be a non-empty partially ordered set in which
(14) lub of every non-empty well ordered subset W ⊂ P exists.

Let f be a map of P into P such that

(15) for every x ∈ P, x ≤ f(x).

Then there exists at least one ξ ∈ P such that ξ = f(ξ). In fact, for every a ∈ P,
ξ = lub W(a) is such an element.

Proof. Consider an a-chain W(a) ⊂ P. By (4) and (14), ξ = lub W(a)
exists. By (15) and (6), ξ = f(ξ). Thus Theorem 3 is proved.

In the following a generalization of Corollary 2 is proved with the help of 
the axiom of choice. 

THEOREM 4. Let P be a partially ordered set in which
(16) lub of every non-empty well ordered subset which is bounded above exists.
Let g be a map of P into P such that, for every two elements x, y ∈ P,

(17) if g(x) < g(y), then x < y;
and, for x, y, s ∈ P,
(18) if g(x) < s < g(y), then g⁻¹(s) ≠ ∅.

Furthermore, let f be an isotone map of P into P, and let there exist a, b ∈ P,
with a < b, satisfying

g(a) ≤ f(a) and f(b) ≤ g(b).
Then there exists at least one ξ ∈ P such that a ≤ ξ ≤ b and f(ξ) = g(ξ).








Proof. If f(a) = g(a) or f(b) = g(b), the conclusion is obvious. Hence we
may assume that

(19) g(a) < f(a) and f(b) < g(b).

Consider the set {S_i} of all non-empty subsets S_i ⊂ P such that there
exists s_i ∈ P with g⁻¹(s_i) = S_i. Clearly, {S_i} ≠ ∅. By the axiom of choice,
there exists a function φ mapping {S_i} into P, such that φ(S_i) ∈ S_i. Hence
(20) gφg⁻¹(s_i) = s_i.

We observe also that, in view of (17),

(21) if s_i < s_j, then every element of g⁻¹(s_i) < every element of g⁻¹(s_j).

We shall show now that the function
(22) h = φg⁻¹f

maps the set Q = {x | x ∈ P, a ≤ x ≤ b} into itself. If x ∈ Q, then, since f
is isotone, by (19) we have

(23) g(a) < f(a) ≤ f(x) ≤ f(b) < g(b),

and hence by (18) we see that g⁻¹[f(x)] ≠ ∅. By (21) and (23) we find
a < φg⁻¹[f(x)] < b. Hence, by (22), h(x) ∈ Q. Taking x = a, we infer also
that

(24) a < h(a).

Furthermore, since f is isotone, if x ≤ y then f(x) ≤ f(y), and from (21) we
infer that φg⁻¹[f(x)] ≤ φg⁻¹[f(y)], so that by (22) h is isotone on Q.

From (24) we see that a and h satisfy (9) on Q. Also, since Q is bounded
above by b, we see from (16) that Q satisfies (7).

Hence Q and h satisfy the hypotheses of Theorem 2, and consequently there
exists ξ ∈ Q such that h(ξ) = ξ. Applying g to each side we have, by (22),

gφg⁻¹[f(ξ)] = g(ξ),
and thus, by (20),
f(ξ) = g(ξ).

This completes the proof.


COROLLARY 4. If in Theorem 4 instead of condition (17) we assume that g is 
isotone, then the conclusion of Theorem 4 remains valid provided P is a simply 
ordered set. 


REFERENCES 


1. G. Birkhoff, Lattice theory (A.M.S. Coll. Publ., 25, rev. ed.; New York, 1948).
2. N. Bourbaki, Théorie des ensembles (Paris, 1956), chapter III, p. 48, Example 6b.
3. N. Bourbaki, Sur le Théorème de Zorn, Archiv der Mathematik, 2 (1949-50), 434-437.





University of Pennsylvania 
and 
Queens College 











ARITHMETIC LINEAR TRANSFORMATIONS 
AND ABSTRACT PRIME NUMBER THEOREMS 


S. A. AMITSUR 


1. Introduction. Shapiro and Forman have presented in (4) an abstract
formulation of prime number theorems which includes the various prime
number theorems: for primes in arithmetic progressions, for prime ideals in
ideal classes, etc. The methods of proof are “elementary” and follow closely
Shapiro’s proof for the primes in arithmetic progression (for reference see the
bibliography in (4)).

The author has followed in (1) some ideas of Yamamoto (5) on arithmetic
linear transformations to introduce a symbolic calculus in dealing with
arithmetic functions. This calculus proved to be very useful in unifying many of
the “elementary” proofs on the behaviour of arithmetic functions. In (6)
Yamamoto has extended his theory to ideals in algebraic number fields, and
with this extension the symbolic calculus of (1) can be extended to cover the
abstract case of the prime number theorem in countable free abelian groups as
discussed in (4). Furthermore, a more careful study of the behaviour of certain
“remainders” yields a more general result in the direction given by Beurling (3).

Shapiro and Forman have considered the following situation. Let G be a
free abelian group on a countable number of generators p_i (i = 1, 2, ...),
and let N: G → 𝔑 be a homomorphism of G into the multiplicative group of all
integers 𝔑, with the kernel G′ such that G/G′ is finite. If H is a generic class
of G/G′, and w is an integral word in G, then the main result of (4) is deriving
from the condition

(1.1) $\sum_{Nw \le x,\; w \in H} 1 = c_H x + R_H(x); \quad c_H \ge 0,\ \sum c_H > 0$

a “prime number theorem” for the class H:

(1.2) $\pi_H(x) = \sum_{Np \le x,\; p \in H} 1 = d_H \frac{x}{\log x} + o\!\left(\frac{x}{\log x}\right).$

A complete analysis of the coefficients d_H was given in (4) for the case
R_H(x) = O(x^θ) with 1 > θ > 0. The methods developed in the present paper
will show that the same results are valid even if R_H(x) = O(x/log^γ x) with
γ > 2. A result of a similar nature, though in a completely different situation,
was given by Beurling (3) for γ > 3/2.

It is quite surprising that for γ > 3 (and in certain cases for γ > 4) the
methods and the results of (1) can be carried over to the abstract case almost
without any change, whereas 3 > γ > 2 involves many refinements of the
methods and of the main “elementary proof” of (1). In fact, some of the
equivalent forms of the prime number theorem cannot be proved by our
methods for 4 > γ > 2; though others can be shown for these values of γ
relatively easily, their classical proofs of implying the prime number theorem
break down if γ < 3.

Received April 27, 1959.

We take this opportunity to present also the symbolic calculus of (1) in a
more general and in what we hope is a simplified form. An application is also
given to show (by elementary methods) that ζ(1 + it) ≠ 0 for t ≠ 0 and
that $\sum p^{-1-it}$ converges.


2. The semi-group W and its characters. In the present context we
prefer to consider the semi-group W of all integral words in G, and similarly
W′ = W ∩ G′. In this way the group K of all characters of G/G′ (4) is replaced
by a finite group of characters of W. To be more precise, we assume the
following:

Let W be a free abelian multiplicative semi-group generated by a countable
number of generators p_i. Let N be a homomorphism of W into the multiplicative
semi-group 𝔑 of all positive integers, that is, Nw is an integer and
N(w_1 w_2) = Nw_1 · Nw_2.

Let K be a finite group of characters of W. By a character χ ∈ K, we mean
a homomorphism of W into the complex numbers. The unit χ_0 ∈ K is defined
as χ_0(w) = 1 for all w ∈ W. Multiplication in K is given by:

(2.1) (χη)(w) = χ(w)η(w).

Let K be a finite group of order h, then it follows readily by (2.1) that
χ(w) is an hth root of unity. Furthermore, each w ∈ W determines a character
ŵ of K by setting ŵ(χ) = χ(w). Thus the mapping w → ŵ is a homomorphism
of W into the group K̂ of all characters of K. Let W′ be the kernel of this map,
that is,

W′ = {w; w ∈ W, χ(w) = 1 for all χ ∈ K}.

This readily implies that W/W′ is a finite group of order ≤ order of K̂ = order
of K = h. Now the classes H of W/W′ are determined by the group of
characters K; that is, u, v belong to the same class H if and only if χ(u) = χ(v)
for all χ ∈ K, or in other words if and only if û = v̂. On the other hand, K
is readily seen to induce a group of characters on the finite group W/W′,
and from the definition of the classes of the latter it follows that different
characters of K induce different characters of W/W′. Consequently h = order
of K ≤ order of W/W′. Combining this with the previous result, we obtain:


PROPOSITION 1. W/W’ is a finite group of order h, and K can be considered 
as the group of all characters of W/W’. 


In many cases the converse situation is preferred. Namely, given W′ ⊂ W
such that W/W′ is finite, we define K to be the group of all characters of W/W′,
and then χ(w) is defined to be χ(H) where w ∈ H, the class in W/W′. In any
case, we shall always use the notation χ(H) and χ(w) for the same character χ.
Now the standard relation between characters yields:

(2.2) $\sum_{H} \chi(H)\eta(H) = \begin{cases} 0 & \text{if } \chi \ne \bar\eta \\ h & \text{if } \chi = \bar\eta \end{cases}$

(2.3) $\sum_{\chi} \chi(u)\bar\chi(v) = \begin{cases} 0 & \text{if } u, v \text{ belong to different classes of } W/W' \\ h & \text{if } u, v \text{ belong to the same class.} \end{cases}$
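
The orthogonality relations (2.2) and (2.3) can be checked numerically in a small case. The sketch below assumes a hypothetical example not taken from the paper: W is the semi-group of positive integers prime to 5, W′ consists of those congruent to 1 modulo 5, so that h = 4, and the characters are indexed by k = 0, 1, 2, 3 through the discrete logarithm to base 2.

```python
import cmath

Q = 5            # hypothetical modulus: W = positive integers prime to 5
H_ORDER = 4      # h = order of W/W'
# Discrete logarithm base 2 modulo 5: 2^0 = 1, 2^1 = 2, 2^2 = 4, 2^3 = 3.
DLOG = {pow(2, j, Q): j for j in range(H_ORDER)}

def chi(k, w):
    """Value of the k-th character at w (w must be prime to 5)."""
    return cmath.exp(2j * cmath.pi * k * DLOG[w % Q] / H_ORDER)

def class_sum(k, m):
    """Left side of (2.2): sum over the classes H of chi_k(H) * chi_m(H)."""
    return sum(chi(k, r) * chi(m, r) for r in DLOG)

def char_sum(u, v):
    """Left side of (2.3): sum over all characters of chi(u) * conj(chi(v))."""
    return sum(chi(k, u) * chi(k, v).conjugate() for k in range(H_ORDER))

print(abs(class_sum(1, 3)), abs(class_sum(1, 1)))
print(abs(char_sum(7, 12)), abs(char_sum(7, 8)))
```

Here the characters with indices 1 and 3 are complex conjugates of each other, and 7 and 12 lie in the same class modulo 5 while 7 and 8 do not.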


Next we assume that for any class H

(2.4) $B_H(x) = \sum_{Nw \le x,\; w \in H} 1 = C_H x + O(x/\log^{\gamma} x); \quad C_H \ge 0,\ \sum C_H > 0.$

We define
(2.5) $\psi_H(x) = \sum_{Np^{i} \le x,\; p^{i} \in H} \log Np; \qquad \pi_H(x) = \sum_{Np \le x,\; p \in H} 1.$

Analogous to the results of Shapiro and Forman (4), we shall show that the
characters can be distributed into three classes Γ_1, Γ_2, Γ_3. Γ_1 will contain all
characters for which A_χ = Σ_H χ(H) C_H ≠ 0; Γ_2 and Γ_3 will be defined later
in §8. Our first result is:


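THEOREM A. If γ > 2, then

$\sum_{Np^{i} \le x} \chi(p^{i}) \log Np = \begin{cases} x + o(x) & \text{if } \chi \in \Gamma_1 \\ o(x) & \text{if } \chi \in \Gamma_2 \\ -x + o(x) & \text{if } \chi \in \Gamma_3 \text{ and } \gamma > 3. \end{cases}$

Let U = {w; χ(w) = 1 for all χ ∈ Γ_1}, and U* = {w; w ∈ U, χ(w) = 1 for
all χ ∈ Γ_3}. Then W′ ⊂ U* ⊂ U and as in (4, Theorem 3.1):

THEOREM B. If (2.4) holds then:

$\pi_H(x) = d_H \frac{x}{\log x} + o\!\left(\frac{x}{\log x}\right)$

if (a) Γ_3 = ∅, γ > 2, where:

$d_H = \begin{cases} 0 & \text{for } H \notin U/W' \\ h^{-1}\,\text{order } \Gamma_1 & \text{for } H \in U/W' \end{cases}$

or (b) Γ_3 ≠ ∅, γ > 3, where:

$d_H = \begin{cases} 0 & \text{for } H \notin U/W' \text{ or } H \in U^*/W' \\ 2h^{-1}\,\text{order } \Gamma_1 & \text{for } H \in U/W' \text{ and } H \notin U^*/W'. \end{cases}$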

3. The ring C(W). Let C(W) be the set of all complex valued functions
of W. As in (5, p. 42), C(W) is a ring with respect to the addition

(3.1) (f + g)(w) = f(w) + g(w); w ∈ W,

and the convolution:

(3.2) $(f * g)(w) = \sum_{uv = w} f(u) g(v).$

We shall also use the ordinary multiplication:

(3.3) (fg)(w) = f(w)g(w).
The ring C(W) is in fact a commutative ring with the unit ε defined by
(3.4) ε(1) = 1, ε(w) = 0 for w ≠ 1 (the unit of W).

As in the classical case W = 𝔑, the integers (1, 5), it is easily shown that
the invertible functions f ∈ C(W) are those for which f(1) ≠ 0, and in this
case f⁻¹(w) is defined by induction on the length of the words:

(3.5) $f^{-1}(1) = 1/f(1); \quad f^{-1}(w) = -\Big[\sum_{u \mid w,\; u \ne w} f^{-1}(u)\, f(wu^{-1})\Big] \Big/ f(1), \quad w \ne 1.$

Let E be the “one” function defined by E(w) = 1 for all w ∈ W; then its
inverse E⁻¹ = μ_W = μ is the Möbius function for W:
(3.6) μ(w) = (−1)^r if w is the product of r distinct generators,
μ(1) = 1, and zero otherwise.

A function f is said to be multiplicative if:
(3.7) f(uv) = f(u)f(v) for (u, v) = 1
where (u, v) = 1 means that u, v have no common divisor ≠ 1 in W. If (3.7)
holds for all u, v without any restriction, then we say that f is factorable or f is
a character. Another type of functions which we meet are the additive functions
which satisfy:
(3.8) f(uv) = f(u) + f(v).

In the general case of an arbitrary semi-group W, as in the case of the integers
(5), we have:

PROPOSITION 2. If f is a character, then the mapping g → gf is an
isomorphism of C(W) into itself. In particular: (g∗h)f = (gf)∗(hf).

If f is an additive function, then the mapping g → gf is a derivation of C(W).
In particular: (g∗h)f = (gf)∗h + g∗(hf).
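
The ring operations and both parts of Proposition 2 can be tried out in the classical case where W is the semi-group of positive integers. The following is a minimal sketch under that assumption; the truncation bound `LIMIT` and all function names are hypothetical, and index 0 of each list is unused padding.

```python
import math

LIMIT = 60  # hypothetical truncation: functions are stored for n = 1..LIMIT

def convolve(f, g):
    """(f*g)(n) = sum over uv = n of f(u) g(v), for all n <= LIMIT."""
    out = [0.0] * (LIMIT + 1)
    for u in range(1, LIMIT + 1):
        for v in range(1, LIMIT // u + 1):
            out[u * v] += f[u] * g[v]
    return out

def pointwise(f, g):
    """Ordinary product (fg)(n) = f(n) g(n)."""
    return [a * b for a, b in zip(f, g)]

def inverse(f):
    """Convolution inverse by recursion over proper divisors; needs f(1) != 0."""
    inv = [0.0] * (LIMIT + 1)
    inv[1] = 1.0 / f[1]
    for n in range(2, LIMIT + 1):
        s = sum(inv[u] * f[n // u] for u in range(1, n) if n % u == 0)
        inv[n] = -s / f[1]
    return inv

def close(a, b):
    """Entrywise comparison up to rounding error."""
    return all(abs(x - y) < 1e-9 for x, y in zip(a, b))

E = [0.0] + [1.0] * LIMIT                                  # the "one" function
MU = inverse(E)                                            # Moebius function
L = [0.0] + [math.log(n) for n in range(1, LIMIT + 1)]     # an additive function
N = [0.0] + [float(n) for n in range(1, LIMIT + 1)]        # a character

# Character case: (g*h)f = (gf)*(hf) with g = E, h = L, f = N.
print(close(pointwise(convolve(E, L), N),
            convolve(pointwise(E, N), pointwise(L, N))))
# Additive case: (g*h)f = (gf)*h + g*(hf) with g = h = E, f = L.
EL = pointwise(E, L)
print(close(pointwise(convolve(E, E), L),
            [a + b for a, b in zip(convolve(EL, E), convolve(E, EL))]))
```

Both comparisons should print `True`, since the norm n is completely multiplicative and log n is completely additive on the integers.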


Let N be the homomorphism of W into the semi-group 𝔑 of all integers. We
shall refer to Nw as the norm of w. Since N is a homomorphism, N is a character,
and consequently the log-function L, defined thus:

(3.9) L(w) = log Nw
is an additive function. Thus, it follows from Proposition 2 that

(3.10) (f∗g)L = fL∗g + f∗gL for all f, g ∈ C(W).





We shall use the notation L^m to mean L^m(w) = log^m(Nw). With the aid of
μ = μ_W we define as in (6, p. 44) the Mangoldt-function Λ = Λ_1 = μ∗L, the
Selberg-function Λ_2 = μ∗L² and higher types Λ_m = μ∗L^m. We recall that

(3.11) Λ(p^e) = log Np and Λ(w) = 0 if w ≠ p^e for a generator p ∈ W.

(3.12) Λ_2(p^e) = (2e − 1) log² Np; Λ_2(p^e q^f) = 2 log Np log Nq;
Λ_2(w) = 0 for w ≠ p^e q^f.

To every f ∈ C(W) we define an arithmetic function Nf ∈ C(𝔑) by setting
(3.13) $(Nf)(n) = \sum_{Nw = n} f(w)$ for every integer n > 0,
and if there are no w ∈ W satisfying Nw = n then we set (Nf)(n) = 0.

Thus (NE)(n) is the number of elements of W whose norm is n. It is not
difficult to show

THEOREM 1. The mapping f → Nf is a homomorphism of C(W) into the
ring of all arithmetic functions C(𝔑).
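
The values of the Mangoldt and Selberg functions on prime powers and on products of two prime powers can be confirmed directly from the definition Λ_m = μ∗L^m. The sketch below assumes the classical case of the positive integers with the identity norm; the helper names are hypothetical.

```python
import math

def mobius(n):
    """Moebius function for the positive integers, by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0            # repeated prime factor
            result = -result
        p += 1
    return -result if n > 1 else result

def lam(m, n):
    """Lambda_m(n) = sum over d | n of mu(d) * log(n/d)**m, that is, mu * L^m."""
    return sum(mobius(d) * math.log(n // d) ** m
               for d in range(1, n + 1) if n % d == 0)

log2, log3 = math.log(2), math.log(3)
print(lam(1, 8), log2)                  # Lambda(2^3) = log 2
print(lam(2, 8), 5 * log2 ** 2)         # Lambda_2(2^3) = (2*3 - 1) log^2 2
print(lam(2, 12), 2 * log2 * log3)      # Lambda_2(2^2 * 3) = 2 log 2 log 3
```

Each printed pair should agree up to rounding, and `lam(1, 12)` and `lam(2, 30)` should vanish up to rounding because 12 is not a prime power and 30 has three distinct prime factors.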


4. The ring of arithmetic linear transformations. Let F be the
linear space of all complex valued functions Φ(x) defined for all real x ≥ 1.
To each f ∈ C(W), we make correspond (as in (1, 5)) a linear transformation
S_f of F, defined by
(4.1) $(S_f\Phi)(x) = \sum_{Nw \le x} f(w)\,\Phi(x/Nw)$; Φ ∈ F and all x ≥ 1.

The following is then easily verified.

PROPOSITION 3.

S_{f+g} = S_f + S_g; cS_f = S_{cf}; S_{f∗g} = S_f S_g.

That means that the correspondence f → S_f is a homomorphism of C(W)
into the ring of all linear transformations of F.

Definition (4.1) is valid for all semi-groups, in particular for W = 𝔑 (the
integers), where in the semi-group of integers the norm is taken to be the identity
map. Then clearly we have, by (3.13),

PROPOSITION 4.
S_f Φ = S_{Nf} Φ.
For practical purposes we prefer to substitute for S_f a different operator
I_f defined by

(4.2) $(I_f\Phi)(x) = \sum_{Nw \le x} \frac{f(w)}{Nw}\,\Phi\!\left(\frac{x}{Nw}\right) = (S_{fN^{-1}}\Phi)(x),$

where (fN⁻¹)(w) = f(w)/Nw.
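
The multiplicative rule S_{f∗g} = S_f S_g of Proposition 3 can be illustrated numerically. The sketch below assumes W is the semi-group of positive integers with Nw = w; the test function `phi` and the sample point are arbitrary hypothetical choices.

```python
import math

def S(f):
    """Return the operator S_f for W = positive integers with Nw = w."""
    def apply(phi):
        # (S_f phi)(x) = sum over n <= x of f(n) * phi(x / n)
        return lambda x: sum(f(n) * phi(x / n) for n in range(1, int(x) + 1))
    return apply

def conv(f, g):
    """Convolution (f*g)(n), summed over the divisors d of n."""
    return lambda n: sum(f(d) * g(n // d) for d in range(1, n + 1) if n % d == 0)

one = lambda n: 1.0                   # the "one" function E
ident = lambda n: float(n)            # the norm, viewed as a function on W
phi = lambda x: math.log(x) + 1.0     # hypothetical test function on x >= 1

x = 30.5
left = S(conv(one, ident))(phi)(x)    # S_{f*g} phi
right = S(one)(S(ident)(phi))(x)      # S_f (S_g phi)
print(left, right)
```

The two printed values should agree up to rounding, since both sides sum f(m) g(n) Φ(x/(mn)) over all pairs with mn ≤ x.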











As we remarked above, N and therefore also N⁻¹ are characters of W, hence
it follows readily by Proposition 2 that Propositions 3 and 4 will hold also
for I_f. For further reference we formulate this result in the following
proposition, which includes also an additional simple fact.

PROPOSITION 5.
I_f + I_g = I_{f+g}; cI_f = I_{cf}; I_f I_g = I_{f∗g}; I_f Φ = I_{Nf} Φ,
and
I_f(Φ log x) = log x · I_f Φ − I_{fL} Φ.


5. The space ℒ. In the present section we extend the formalism
introduced in (1) to cover the general case dealt with in the present paper.

Let ℒ be the space of all polynomials Φ(log x) = Σ a_ν log^ν x in the function
log x. We introduce the formal derivation D = d/d log x with all its positive
and negative powers by writing

(5.1) D^m log^n x = (n)_m log^{n−m} x for all n ≥ m, n ≥ 0,
      D^m log^n x = 0 if m > n,

where (n)_m = n!/(n − m)! if n ≥ m and n ≥ 0; m can be positive or negative.
Thus, D⁰ is the identity. For completeness we set (n)_m = 0 if m > n. Now
D^m acts on Φ(log x) by setting: D^m(Σ a_ν log^ν x) = Σ (ν)_m a_ν log^{ν−m} x.

Let α_{−p}, α_{−p+1}, ..., α_0, α_1, ... be a sequence of complex numbers; then the
symbol

$F(D) = \sum_{\nu=-p}^{\infty} \alpha_\nu D^{\nu}$

will be considered as a linear operator on ℒ, by putting

(5.2) $F(D)\log^n x = \sum_{\nu=-p}^{\infty} \alpha_\nu D^{\nu}\log^n x = \sum_{\nu=-p}^{n} (n)_\nu\,\alpha_\nu \log^{n-\nu}x.$

Let f ∈ C(W), F(D) be as above. Then we denote by R_n(x; f, F) the remainder
element defined by the relation

(5.3) I_f log^n x = F(D) log^n x + R_n(x; f, F).

That is, in view of (5.2)

(5.4) $R_n(x; f, F) = \sum_{Nw \le x} \frac{f(w)}{Nw}\log^n\frac{x}{Nw} - \sum_{\nu=-p}^{n} (n)_\nu\,\alpha_\nu \log^{n-\nu}x.$

As in (1) we shall write
(5.5) I_f = F(D) + O(φ_n)
to mean
R_n(x; f, F) = O(φ_n(x)), for all n ≥ 0.

The notations R_n(x), R_n(x; f) and R_n(f) will replace R_n(x; f, F) when no
confusion will be involved.








For further reference we fix

$F(D) = \sum_{\nu=-p}^{\infty} \alpha_\nu D^{\nu}, \qquad G(D) = \sum_{\mu=-q}^{\infty} \beta_\mu D^{\mu}.$

The following is easily verified.

THEOREM 2. (1) αR_n(x; f, F) + βR_n(x; g, G) = R_n(x; αf + βg, αF + βG)
(2) R_n(x; fL, −F′) = log x · R_n(x; f, F) − R_{n+1}(x; f, F)

where F′ = Σ ν α_ν D^{ν−1} is the formal derivative with respect to D.

The proof of (2) follows as in the proof of (4) of (1, Theorem 4.1).
Another simple result which is of great importance in the present paper is

THEOREM 3.

$R_n(x; g*f, GF) = I_g R_n(x; f, F) + \sum_{j=0}^{n+p} (n!/j!)\,\alpha_{n-j}\,R_j(x; g, G) - \sum_{t=0}^{q-1}\sum_{s=t+1}^{q} \alpha_{n-t+s}\,\beta_{-s}\,(n!/t!)\log^{t}x.$

This will be used mainly in the following form (noting that α_{−p} ≠ 0):

(5.6) $R_{n+p}(x; g, G) = c\,I_g R_n(x; f, F) + \sum_{j=0}^{n+p-1} c_j R_j(x; g, G) + \sum_{t=0}^{q-1} c'_t \log^{t}x + d\,R_n(x; g*f, GF)$

for some constants c, c_j, c′_t, d. In both formulas, if n + p < 0 the term
containing R_j(x; g, G) does not appear, and if q − 1 < 0 the sum over t is not to
be considered.
We note also that G(D)F(D) is the formal product of the two power series
in D and not the product of the operators G and F; the two products are not
always equal, as can be seen by: 1 = (D⁻¹D)1 ≠ D⁻¹(D1) = 0.
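
The difference between the formal product and the composition of operators can be made concrete. The following is a minimal sketch, assuming a polynomial in log x is stored as a dictionary from exponents to rational coefficients; the names `falling` and `D` are hypothetical.

```python
from fractions import Fraction
from math import factorial

def falling(n, m):
    """(n)_m = n!/(n-m)! for n >= m (m may be negative); 0 if m > n."""
    if m > n:
        return Fraction(0)
    return Fraction(factorial(n), factorial(n - m))

def D(m, poly):
    """Apply D^m to a polynomial in log x given as {exponent: coefficient}."""
    out = {}
    for n, c in poly.items():
        coeff = falling(n, m) * c
        if coeff != 0:
            out[n - m] = out.get(n - m, Fraction(0)) + coeff
    return out

one = {0: Fraction(1)}           # the constant polynomial 1
formal = D(0, one)               # formal product D^{-1} D = D^0, applied to 1
composed = D(-1, D(1, one))      # D^{-1} applied after D
print(formal, composed)
```

The formal product returns the polynomial 1, while the composition returns the zero polynomial (an empty dictionary), because D annihilates constants before D⁻¹ can act.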


Proof.

$I_{g*f}\log^n x = I_g(I_f\log^n x) = I_g\Big[\sum_{\nu=-p}^{n} (n)_\nu\,\alpha_\nu\log^{n-\nu}x + R_n(x; f, F)\Big]$

$= I_g R_n(x; f, F) + \sum_{\nu=-p}^{n} (n)_\nu\,\alpha_\nu\, I_g\log^{n-\nu}x$

$= I_g R_n(x; f, F) + \sum_{\nu=-p}^{n} (n)_\nu\,\alpha_\nu\, R_{n-\nu}(x; g, G) + \sum_{\nu=-p}^{n}\sum_{\mu=-q}^{n-\nu} (n)_\nu\,\alpha_\nu\,(n-\nu)_\mu\,\beta_\mu\log^{n-\nu-\mu}x$

$= A + B + C = (GF)\log^n x + R_n(x; g*f, GF).$

The terms A, B appear in the statement of Theorem 3 (by setting j = n − ν),
and if n + p < 0, we do not get B. To complete the proof of Theorem 3 we
have to compare

$[G(D)F(D)]\log^n x = \sum_{k=-(p+q)}^{n}\Big(\sum_{\nu+\mu=k} \alpha_\nu\beta_\mu\Big)(n)_k\log^{n-k}x$

with

$C = \sum_{\nu=-p}^{n}\sum_{\mu=-q}^{n-\nu} (n)_\nu (n-\nu)_\mu\,\alpha_\nu\beta_\mu\log^{n-\nu-\mu}x = \sum_{k=-(p+q)}^{n}\Big({\sum}'_{\nu+\mu=k} \alpha_\nu\beta_\mu\Big)(n)_k\log^{n-k}x,$

which is obtained by setting ν + μ = k. The difference between the two is
that in (GF) log^n x, the sum ranges over all ν ≥ −p, μ ≥ −q, whereas in the
second sum it ranges only over n ≥ ν ≥ −p, n − ν ≥ μ ≥ −q. Comparing
the two we observe that they have common range as long as min(n, k + q) ≥
ν ≥ −p with k = ν + μ ≤ n. Thus the terms for which k + q ≥ ν > n show
that:

$C - (GF)\log^n x = -\sum_{k=n-q+1}^{n}\Big({\sum}''_{\nu+\mu=k} \alpha_\nu\beta_\mu\Big)(n)_k\log^{n-k}x = -\sum_{t=0}^{q-1}\sum_{s=t+1}^{q} (n!/t!)\,\alpha_{n-t+s}\,\beta_{-s}\log^{t}x,$

since in Σ″, ν > n; hence, the last form is obtained by setting t = n − k
and s = −μ, as then ν = k − μ = n − t + s. (If q = 0, this term does not
appear, since k + q ≤ n.)

The relation (5.6) is very useful in computing R_n(x; f⁻¹, F⁻¹) by induction,
since it provides us with a recursive formula for R_n(x; f⁻¹, F⁻¹), as will be
used later.

Another formula for R_n(g∗f) has been obtained in (2) following the Dirichlet
hyperbola method for summation. This result has been proved only for the
integers (Theorem 1 of (2)), and we formulate it here for the semi-group W,
but the two results are equivalent, as is readily seen by the equality I_f = I_{Nf},
which leads to the relation: R_n(x; f, F) = R_n(x; Nf, F) for all f ∈ C(W). We
quote that result in the following theorem.


THEOREM 4. Let yz = x; 1 ≤ y ≤ x; then


_ > gw) (.. ) f(w) (s., ) 
R(x; ef, GF) = Zz, Nw R, Nw J Ff) + Z, Nw Rn Nw’ g G 





-> (")r sie G)Rj(2;f, F) 


. n! m 
i=1 z, (4 Fin + jy Rats 86) log*’s + 





q i 


n!' = 
u > GaP iGe tpi P-Rersife F) log'~’y, 








and the respective terms do not appear if the power series of F(D), G(D) do not
have negative powers of D (that is, p ≤ 0 or q ≤ 0).


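The following lemma will be used extensively in §6.

LEMMA 1. Let

$S(x) = \sum_{Nw \le x} f(w),$

and let φ(x) be a differentiable function. Then

$\sum_{Nw \le x} f(w)\,\varphi(Nw) = S(x)\varphi(x) - \int_1^x S(t)\varphi'(t)\,dt,$

or more generally

$\sum_{y < Nw \le x} f(w)\,\varphi(Nw) = S(x)\varphi(x) - S(y)\varphi(y) - \int_y^x S(t)\varphi'(t)\,dt.$

This lemma follows immediately from (7, Theorem 421, p. 346) noting that

$\sum_{Nw \le x} f(w) = \sum_{n \le x} (Nf)(n).$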
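
A small exact check of the partial summation formula may help fix the signs. The sketch below assumes the simplest hypothetical case: W is the positive integers, f = 1 (so the summatory function is the integer part of t), and φ(t) = 1/t, for which the integral can be evaluated piece by piece in rational arithmetic.

```python
from fractions import Fraction

def lhs(x):
    """Sum over n <= x of phi(n) = 1/n."""
    return sum(Fraction(1, n) for n in range(1, int(x) + 1))

def rhs(x):
    """S(x)*phi(x) - integral_1^x S(t)*phi'(t) dt, S(t) = floor(t), phi(t) = 1/t."""
    m = int(x)
    boundary = Fraction(m) / x
    # On [k, k+1) the step function S equals k, and -phi'(t) = 1/t^2,
    # so each full piece contributes k * (1/k - 1/(k+1)).
    integral = sum(k * (Fraction(1, k) - Fraction(1, k + 1)) for k in range(1, m))
    integral += m * (Fraction(1, m) - 1 / x)    # last piece [m, x]
    return boundary + integral

x = Fraction(21, 2)
print(lhs(x), rhs(x))
```

Both sides are computed as exact fractions, so they can be compared with `==` rather than with a tolerance.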


6. Approximating I_f. In the following two sections we consider functions
f ∈ C(W) with properties

(6.1) $S_f 1 = \sum_{Nw \le x} f(w) = ax + O(x/\log^{\gamma}x), \quad \gamma = 1 + \delta > 0,$

(6.2) $S_{|f|} 1 = \sum_{Nw \le x} |f(w)| = Ax + O(x/\log^{\gamma}x),$

or the weaker condition:

(6.2*) $\sum_{Nw \le x} |f(w)| = O(x).$

For later applications we shall introduce the assumption

(6.3) f⁻¹ exists in C(W) and |f⁻¹(w)| ≤ K|f(w)| for some K > 0, and all w ∈ W.
These functions will satisfy
PROPOSITION 6.

(6.4) $I_f 1 = \sum_{Nw \le x} (Nw)^{-1} f(w) = a\log x + \alpha_0 + \rho(x), \quad \rho(x) = O(\log^{-\delta}x)$

(6.5a) $S_{fL} 1 = \sum_{Nw \le x} f(w)\log Nw = ax\log x - ax + a + O(x\log^{-\delta}x)$

(6.5b) $S_{fL^2} 1 = \sum_{Nw \le x} f(w)\log^2 Nw = ax\log^2 x - 2ax\log x + 2ax - 2a + O(x\log^{1-\delta}x).$


The proof follows immediately from (6.1) by applying Lemma 1. We observe
also that, since δ = γ − 1 > 0,

$\alpha_0 = a + \int_1^{\infty} O(t^{-1}\log^{-\gamma}t)\,dt; \qquad \rho(x) = O(\log^{-\gamma}x) + \int_x^{\infty} O(t^{-1}\log^{-\gamma}t)\,dt = O(\log^{-\delta}x).$
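
The shape of the expansion of I_f 1 can be seen in the most familiar case. The sketch below assumes a hypothetical example outside the paper's hypotheses only in being far better behaved: W is the positive integers and f = 1, so that a = 1, the constant term is Euler's constant, and the remainder is in fact smaller than 1/x.

```python
import math

EULER_GAMMA = 0.5772156649015329   # Euler's constant, the constant term here

def I_one(x):
    """I_f 1 for f = 1 on the positive integers: sum of 1/n over n <= x."""
    return sum(1.0 / n for n in range(1, int(x) + 1))

def rho(x):
    """Remainder in I_f 1 = a*log x + constant + rho(x), with a = 1."""
    return I_one(x) - math.log(x) - EULER_GAMMA

for x in (10, 100, 1000):
    print(x, rho(x))
```

At integer arguments the printed remainders are positive and close to 1/(2x), which is consistent with a bound of the form O(log^{−δ} x) for every δ > 0.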
In what follows we determine an approximation of I_f assuming only the
validity of (6.4), and to simplify results, we assume henceforth that δ = γ − 1
is not an integer.


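From Lemma 1 we obtain, for n > 0,

$I_f\log^n x = \sum_{Nw \le x} (Nw)^{-1} f(w)\log^n(x/Nw) = -\int_1^x \big(a\log t + \alpha_0 + \rho(t)\big)\,d\log^n(x/t)$

$= (n+1)^{-1} a\log^{n+1}x + \alpha_0\log^n x - \sum_{\nu=1}^{n}\binom{n}{\nu}(-1)^{\nu}\log^{n-\nu}x\int_1^x \rho(t)\,d\log^{\nu}t$

$= (n+1)^{-1} a\log^{n+1}x + \alpha_0\log^n x + \sum_{1\le\nu<\delta}\frac{(-1)^{\nu-1}}{(\nu-1)!}\int_1^{\infty}\rho(t)\,t^{-1}\log^{\nu-1}t\,dt\cdot\frac{n!}{(n-\nu)!}\log^{n-\nu}x$

$\quad + \sum_{\nu<\delta}(-1)^{\nu}\binom{n}{\nu}\log^{n-\nu}x\int_x^{\infty}\rho(t)\,d\log^{\nu}t - \sum_{\nu>\delta}(-1)^{\nu}\binom{n}{\nu}\log^{n-\nu}x\int_1^x \rho(t)\,d\log^{\nu}t.$

This is true since, for ν < δ,

$\int_1^{\infty}\rho(t)\,d\log^{\nu}t = \nu\int_1^{\infty} t^{-1}\rho(t)\log^{\nu-1}t\,dt < \infty$

as ρ(t) = O(log^{−δ} t). If n < δ, we disregard the last term. Put

(6.6) $F(D) = \sum_{\nu=-1}^{\infty}\alpha_\nu D^{\nu}$ with α_{−1} = a, α_0 as given in (6.4),
α_ν = 0 for ν > δ, and
$\alpha_\nu = \frac{(-1)^{\nu-1}}{(\nu-1)!}\int_1^{\infty}\rho(t)\,\frac{\log^{\nu-1}t}{t}\,dt$ for 1 ≤ ν < δ.

Thus we have obtained that I_f log^n x = F(D) log^n x + R_n(x; f, F) where

(6.7) $R_n(x; f, F) = \sum_{\nu<\delta}(-1)^{\nu}\binom{n}{\nu}\log^{n-\nu}x\int_x^{\infty}\rho(t)\,d\log^{\nu}t - \sum_{\nu>\delta}(-1)^{\nu}\binom{n}{\nu}\log^{n-\nu}x\int_1^x \rho(t)\,d\log^{\nu}t$

and for the case n = 0, we have clearly by (6.4)
(6.7*) R_0(x; f, F) = ρ(x).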

If n < δ we can obtain a better form for R_n, namely

(6.8) $R_n(x; f, F) = \sum_{\nu=1}^{n}(-1)^{\nu}\binom{n}{\nu}\log^{n-\nu}x\int_x^{\infty}\rho(t)\,d\log^{\nu}t = \int_x^{\infty}\rho(t)\,d\log^{n}(xt^{-1}) = (-1)^n\int_0^{\infty}\rho(xe^{u})\,du^{n},$

where the latter is obtained by setting u = −log(xt⁻¹).
If (6.4) holds for all δ > 0 (that is, (6.1) is valid for all γ > 0) then we
define α_ν by the integral of (6.6) for all ν ≥ 1. Furthermore, it follows readily
from (6.8) that, for all n < δ,

(6.9) $R_n(x; f, F) = \pm\int_0^{\infty}\rho(xe^{u})\,du^{n} = O(\log^{n-\delta}x).$

Thus we have

THEOREM 5. If

$\sum_{Nw \le x} (Nw)^{-1} f(w) = a\log x + \alpha_0 + O(\log^{-\delta}x)$

for all δ > 0 then I_f = F(D) + O(log^{−δ} x) for all δ > 0, and F(D) is as given
in (6.6).
In many cases we can obtain a better bound for R_n(x; f, F).

If ρ(x) = O(x^{−δ}), δ > 0, then one readily obtains from (6.8) that

$R_n(x; f, F) = O\Big(\int_0^{\infty} x^{-\delta}e^{-\delta u}\,du^{n}\Big) = O(x^{-\delta}).$

COROLLARY. If

$\sum_{Nw \le x} f(w) = ax + O(x^{1-\delta})$

then I_f = F(D) + O(x^{−δ}).
Now, if (6.9) is valid only for a bounded δ, then we can only show

THEOREM 6. R_n(x; f, F) = O(log^{n−δ} x) with F(D) as given in (6.6).

Indeed, for ν < δ,

$\int_x^{\infty}\rho(t)\,d\log^{\nu}t = O(\log^{\nu-\delta}x)$

and for ν > δ

$\int_1^{x}\rho(t)\,d\log^{\nu}t = O(\log^{\nu-\delta}x).$

Thus our theorem follows immediately from (6.7).
Applying Theorem 2 to this approximation of I_f yields

THEOREM 7.

$I_{fL} = -\sum_{\nu=-1}^{\infty}\nu\,\alpha_\nu D^{\nu-1} + O(\log^{n+1-\delta}x),$

and generally

$I_{fL^m} = (-1)^m F^{(m)}(D) + O(\log^{n+m-\delta}x),$

where

$F(D) = \sum_{\nu=-1}^{\infty}\alpha_\nu D^{\nu}$

is given in (6.6) and F^{(m)}(D) denotes the mth formal derivative of F(D) with
respect to D.


The coefficients α_{−1} = a, α_0, α_1, A are defined in (6.2) and (6.6). We have to
deal separately with the following cases.

Case I: α_{−1} ≠ 0 (this implies that A ≠ 0).
Case II: α_{−1} = 0, α_0 ≠ 0, and A ≠ 0.
Case III: α_{−1} = α_0 = 0 (this will imply that A ≠ 0).
Case IV: α_{−1} = 0, α_0 ≠ 0 and A = 0.
Our first purpose is to show that
THEOREM 8.

$I_{f^{-1}} = F^{-1}(D) + O(1) + \begin{cases} O(\log^{n-\delta}x) & \text{in Cases I, IV} \\ O(\log^{n+1-\delta}x) & \text{in Case II} \\ O(\log^{n+2-\delta}x) & \text{in Case III,} \end{cases}$

and the four cases contain all possible conditions on the coefficients.


We shall need the following lemma.
LEMMA 2. Let h(w) ≥ 0 satisfy

$S_h 1 = \sum_{Nw \le x} h(w) = O(x).$

Let g(x) = O(log^r x) be a non-negative function, bounded in finite intervals; then
I_h g = O(1) + O(log^{r+1} x). If S_h 1 = O(x/log^τ x), τ > 1, then I_h g = O(1) +
O(log^r x).

Indeed, let |g(x)| ≤ K log^r x for x ≥ a; then

$|I_h g| \le \sum_{xa^{-1} < Nw \le x} (Nw)^{-1}h(w)\,|g(x/Nw)| + K\sum_{Nw \le xa^{-1}} (Nw)^{-1}h(w)\log^{r}(x/Nw)$

$\le \sup_{1 \le t \le a}|g(t)|\,(xa^{-1})^{-1}\cdot O(x) + K(xa^{-1})^{-1}\log^{r}a\cdot O(x) - K\int_1^{xa^{-1}} O(t)\,d[t^{-1}\log^{r}xt^{-1}] = O(1) + O(\log^{r+1}x),$

as can readily be obtained by substituting u = x/t in the integral.
The second result follows similarly if we observe that




f O(t log~’ t)d[t~* log’ xt™"] < K fate tog’ xt" + 
1 1 


1 


+ Lf t log~" td{t* log” xt~*) 


for some 1 < s < a~', and some constants K, L > 0. Clearly, the first integral 
is O(log’x) and the second is: 


za~! 
= of f * log" # log” (xt) = O(log’x). 


As a special case, if f(w) satisfies (6.2) and (6.3) and g(x) is as above, then
we have

COROLLARY 2.
I_{f⁻¹} g = O(I_{|f|}|g|) = O(1) + O(log^{r+1} x) if A ≠ 0,
I_{f⁻¹} g = O(I_{|f|}|g|) = O(1) + O(log^r x) if A = 0.

Indeed, by (6.3) it follows that |f⁻¹(w)| ≤ K|f(w)| for some K > 0. Thus,
I_{f⁻¹} g = O(I_{|f|}|g|) and the rest follows by the preceding lemma.
Next we prove

PROPOSITION 7. If f satisfies (6.1)-(6.3) and δ > 1 (that is, γ > 2), then
one of the coefficients α_{−1}, α_0, α_1 of (6.6) is not zero. Furthermore, if α_{−1} = α_0
= 0 then A ≠ 0.

For, let α_{−1} = α_0 = 0; then from Theorem 6 we deduce that I_f log x = α_1
+ O(log^{1−δ} x). Applying I_{f⁻¹} on both sides and using the preceding corollary,
we obtain

log x = I_{f⁻¹} I_f log x = α_1 I_{f⁻¹} 1 + O(log^{2−δ} x) + O(1) = α_1 I_{f⁻¹} 1 + o(log x)

since 2 − δ < 1, and this can only be true if α_1 ≠ 0. Moreover, in this case,
it follows in view of (6.3) that:

|α_1⁻¹ log x + o(log x)| ≤ |I_{f⁻¹} 1| ≤ K I_{|f|} 1 = AK log x + O(1);

hence A ≠ 0. This proves also that the four cases described in Theorem 8
cover all possible cases. (Note that at this point only in Case IV we assumed
γ > 2.)


Remark. If in (6.2) we assume that A = 0, then clearly α_{−1} = 0, since
|I_f 1| ≤ I_{|f|} 1, and in this case it follows that α_0 ≠ 0. The latter is then true
even for δ > 0, since we can use the better bound given in Corollary 2 for
I_{f⁻¹} O(log^{1−δ} x). So that if α_0 = 0 we would have

log x = I_{f⁻¹} I_f log x = α_1 I_{f⁻¹} 1 + I_{f⁻¹} O(log^{1−δ} x) = α_1 I_{f⁻¹} 1 + O(log^{1−δ} x),

which implies that α_1 ≠ 0 even for δ > 0. But then

|α_1⁻¹ log x + o(log x)| = |I_{f⁻¹} 1| ≤ I_{|f|} 1 = O(log^{−δ} x),

which is a contradiction. Hence, α_0 ≠ 0.
We are now in position to prove Theorem 8.

Case I. Since a = α_{−1} ≠ 0, F⁻¹(D) = a⁻¹D + ... and therefore
I_{f⁻¹} 1 = F⁻¹(D)1 + R_0(x; f⁻¹, F⁻¹).
That is,
R_0(x; f⁻¹, F⁻¹) = I_{f⁻¹} 1.

To evaluate this element, consider the following.

1 = S_{f⁻¹} S_f 1 = S_{f⁻¹}[ax + O(x log^{−γ} x)] = ax I_{f⁻¹} 1 + x I_{f⁻¹} O(log^{−γ} x)
  = ax R_0(x; f⁻¹, F⁻¹) + x O(log^{−γ+1} x) + x O(1).

Since S_f x = x I_f 1, and since −δ = 1 − γ, we have shown that R_0(x; f⁻¹, F⁻¹)
= O(1) + O(log^{−δ} x). We complete the proof of this case by induction on n.

Observing that:
0 = R_n(x; ε, 1) = R_n(x; f⁻¹∗f, F⁻¹F)

we obtain by (5.6) (where p = 1), in view of Theorem 6 and Corollary 2,

$R_{n+1}(x; f^{-1}, F^{-1}) = O[I_{f^{-1}} R_n(x; f, F)] + O\Big(\sum_{j=0}^{n} |R_j(x; f^{-1}, F^{-1})|\Big)$

$= O[I_{|f|}\,O(\log^{n-\delta}x)] + \sum_{j=0}^{n} O(\log^{j-\delta}x) + O(1) = O(\log^{n+1-\delta}x) + O(1),$

which completes the proof of this case.

Case II. The proof follows by a similar application of (5.6). In this case,
p = 0 and we need no special method for computing R_0. As we have by (5.6)

$R_n(x; f^{-1}, F^{-1}) = O[I_{f^{-1}} R_n(x; f, F)] + O\Big(\sum_{j=0}^{n-1} |R_j(x; f^{-1}, F^{-1})|\Big)$

where for n = 0 the sum does not appear. Thus using again an induction,
together with Corollary 2 and Theorem 6, we obtain that

R_n(x; f⁻¹, F⁻¹) = O(log^{n+1−δ} x) + O(1).

Incidentally, this provides the proof for Case IV also, since there A = 0
and we can use the better approximation
I_{f⁻¹} R_n(x; f, F) = O(log^{n−δ} x) + O(1)
which will yield in Case IV
R_n(x; f⁻¹, F⁻¹) = O(log^{n−δ} x).


Case III. Again we use the same procedure, but here p = −1. So that

$R_{n-1}(x; f^{-1}, F^{-1}) = O[I_{f^{-1}} R_n(x; f, F)] + O\Big(\sum_{j=0}^{n-2} |R_j(x; f^{-1}, F^{-1})|\Big)$

and we thus obtain R_{n−1}(x; f⁻¹, F⁻¹) = O(log^{n+1−δ} x), which completes the
proof of Theorem 8.
It follows now readily from Theorem 3 that

THEOREM 9. For m ≥ 1:

$I_{f^{-1}*fL^m} = (-1)^m F^{-1}(D)F^{(m)}(D) + O(1) + \begin{cases} O(\log^{n+m+1-\delta}x) & \text{in Cases I-III} \\ O(\log^{n+m-\delta}x) & \text{in Case IV.} \end{cases}$

Indeed, Theorem 3 implies

$R_n(x; f^{-1}*fL^m) = I_{f^{-1}} R_n(fL^m) + O\Big(\sum_{j=0}^{n+p} |R_j(f^{-1})|\Big) + O(1)$

where D^{−p} is the first power of D appearing in F^{(m)}(D), and O(1) has to be
added only if α_{−1} = α_0 = 0, since then F⁻¹(D) = α_1⁻¹D⁻¹ + ... (that is,
q = 1).

Since F(D) has at most one negative power of D, that is, D⁻¹, the mth
derivative may have the lowest power D^{−(m+1)}, thus p ≤ m + 1. Furthermore,
in view of Corollary 2 and Theorem 7,

I_{f⁻¹} R_n(fL^m) = O(log^{n+m+1−δ} x) + O(1).

The other terms can get at most to this power, by Theorem 8, which proves
Cases I-III. In Case IV p = 0 and we can apply the better bound of Corollary
2 to yield the required result.

In particular this leads to

COROLLARY 3. If

$\sum_{Nw \le x} f(w) = ax + O(x\log^{-\delta}x)$

for all δ > 0 and then
$I_{f^{-1}*fL^m} = (-1)^m F^{-1}(D)F^{(m)}(D) + O(1).$

This includes the known results (1) about I_μ, I_Λ, I_{Λ_2}, where μ, Λ, Λ_2 are
the Möbius, Mangoldt, and Selberg functions for the integers, respectively.
More applications will be given later.


7. Approximating characters. Let f be a character on W; then the
preceding results can be further refined in the direction of the “elementary
proofs” developed in (1). This can be achieved relatively easily following the
proofs of Theorems 9.1 and 9.2 of (1), but only if we assume that γ > 3, where
γ is given in (6.1). We shall outline the proofs of this fact later.

In the present section we want to obtain results which will give us the proof
of Theorems A and B even for γ > 2. We will be able to obtain the result that
Σ f(w)Λ(w) = x + o(x) if γ > 2 in most cases, whereas the relation
Σ f(w)Nw⁻¹Λ(w) = log x + c + o(1) will be obtained only for γ > 3.

In the rest of this section we assume that

(A) f is a character satisfying (6.1), (6.2), and A ≠ 0, γ = 1 + δ > 2.
Since f is a character, it follows by Proposition 2 that f⁻¹(w) = f(w)μ(w),
which shows that f satisfies also (6.3). For these characters we show


PROPOSITION 8.

(7.1) $S_{|f|\Lambda} 1 = \sum_{Nw \le x} |f(w)|\,\Lambda(w) = O(x)$

(7.2) $S_{|f|\Lambda_2} 1 = \sum_{Nw \le x} |f(w)|\,\Lambda_2(w) = 2x\log x + O(x) + O(x\log^{2-\delta}x)$ (Selberg’s formula)

(7.3) $\Big|\sum_{x < Nw \le tx} f(w)\,\Lambda(w)\Big| \le 2(t-1)x + o(x)$ as (t, x) → (1, ∞).


Proof. It follows from Proposition 2 that, since f is a character,
(7.4) f⁻¹ = fμ; fΛ = f(μ∗L) = f⁻¹∗fL and fΛ_2 = f(μ∗L²) = f⁻¹∗fL².

As the mapping g → gL is a derivation in C(W), we have: (μ∗L)L = μL∗L
+ μ∗L² = −(μ∗μ∗L)∗L + μ∗L² = −(μ∗L)² + (μ∗L²). Hence:

(7.5) fΛ_2 = fΛ² + fΛL.
Now |f| is also a character, hence it follows by (6.5a) that:

$x^{-1}\sum_{Nw \le x} |f(w)|\,\Lambda(w) = x^{-1} S_{|f|\Lambda} 1 = x^{-1} S_{|f|^{-1}} S_{|f|L} 1$
$= x^{-1} S_{|f|^{-1}}\,[Ax\log x - Ax + A + O(x\log^{-\delta}x)]$
$= A\,I_{|f|^{-1}}\log x - A\,I_{|f|^{-1}} 1 + x^{-1}A\,S_{|f|^{-1}} 1 + I_{|f|^{-1}} O(\log^{-\delta}x)$
$= O(1) + O(\log^{1-\delta}x) = O(1),$

which follows immediately by Corollary 2 and Theorem 8. This gives the
proof of (7.1). The proof of (7.2) follows similarly by use of (6.5b). Namely

$x^{-1}\sum_{Nw \le x} |f(w)|\,\Lambda_2(w) = I_{|f|^{-1}}\big(x^{-1} S_{|f|L^2} 1\big)$
$= I_{|f|^{-1}}\,[A\log^2 x - 2A\log x + 2A + O(\log^{1-\delta}x)]$
$= 2\log x + O(1) + O(\log^{2-\delta}x),$

since by Theorem 2 I_{|f|⁻¹} log² x = (A⁻¹D + ...) log² x + O(log^{2−δ} x).
The proof of (7.3) follows now by standard methods from (7.2) and (7.5).
That is

$0 \le \sum_{x < Nw \le tx} |f(w)|\,\Lambda(w) \le \sum_{x < Nw \le tx} |f(w)|\,\Lambda(w)\log Nw\,\log^{-1}x$

$\le \log^{-1}x \sum_{x < Nw \le tx} |f(w)|\,\Lambda_2(w) = \log^{-1}x\,(2tx\log tx - 2x\log x) + O(x\log^{-1}x) + O(x\log^{1-\delta}x) = 2(t-1)x + o(x).$


The ‘‘elementary proofs” lie in the following refinement of (2, Theorem 4). 


THEOREM 10. Let $g(w) \in C(W)$ be a non-negative function satisfying

(g1) $$\sum_{Nw \le x} g(w) = Mx\log^n x + o(x\log^n x); \quad M > 0,\ n \ge 1.$$

Let h(x) be a real or complex-valued function which satisfies

(h1) $h(x) = O(1)$,
(h2) $\sum_{\nu \le x} \nu^{-1}h(\nu) = O(1)$,
(h3) $h(tx) - h(x) = o(1)$ as $(t,x) \to (1,\infty)$.

Then the condition

(g2) $$|h(x)|\log^{n+1}x \le \frac{n+1}{M}\sum_{Nw \le x}\frac{g(w)}{Nw}\,\Bigl|h\Bigl(\frac{x}{Nw}\Bigr)\Bigr| + o(\log^{n+1}x)$$

implies
$$h(x) = o(1).$$


This theorem has been given in (2, Theorem 4) with the condition
$$\sum (Nw)^{-1}g(w) = a\log^{n+1}x + b\log^n x + o(\log^n x)$$
which is stronger than (g1), since (g1) implies only that

(7.6) $$I_g 1 = \sum_{Nw \le x}(Nw)^{-1}g(w) = [Mx\log^n x + o(x\log^n x)]x^{-1} + \int_1^x [Mt\log^n t + o(t\log^n t)]\,t^{-2}\,dt$$
$$= (n+1)^{-1}M\log^{n+1}x + o(\log^{n+1}x).$$
To prove this theorem, we first observe that for given $t > 1$, we can find
$x_t$ such that, for x, y satisfying $xy^{-1} > x_t$, the following is valid:

(7.7) $$\sum_{y < x/Nw \le yt}(Nw)^{-1}g(w) > C\log^n(xy^{-1}) \quad\text{for some } C > 0.$$

Indeed, choose $\delta$ (to be fixed later); then there is $x_t$ such that for $x > x_t$,
the absolute value of the error term in (g1) is $< \delta x\log^n x$. Then it follows by
(g1) that

$$\sum_{y < x/Nw \le yt}(Nw)^{-1}g(w) \ge yx^{-1}\sum_{y < x/Nw \le yt} g(w)$$
$$\ge yx^{-1}\bigl[Mxy^{-1}\log^n xy^{-1} - Mx(yt)^{-1}\log^n x(yt)^{-1} - 2\delta xy^{-1}\log^n xy^{-1}\bigr]$$
$$\ge M(1 - t^{-1})\log^n xy^{-1} + Mt^{-1}\bigl[\log^n xy^{-1} - \log^n x(yt)^{-1}\bigr] - 2\delta\log^n xy^{-1}$$
$$\ge M(1 - t^{-1} - 2\delta)\log^n xy^{-1}$$














100 S. A. AMITSUR 


and (7.7) is true if we choose $C = M(1 - t^{-1} - 2\delta) > 0$, which can be
fulfilled as $t > 1$.

By the standard method of Selberg’s proof (1, Theorem 6.1 and 2, Theorem 
4) one can obtain the following. 

(7.8) Given A > 0, there exists x,, 7 > ¢ > 1 such that for x > x,, there 
is y,x < y < yt < xT with the property that for all y < z < yt, |h(z)| < A. 

We turn now to the proof of Theorem 10, which contains only a more 
careful repetition of the proof of (2, Theorem 4). 

Let lim sup|h(x)| = A. If A > 0, choose A = 3A and fix x,,T >t>1 
satisfying (7.8). For this given ¢ we choose x; to satisfy -(7.7). Now for given 
e > 0, let |h(x)| < A + € for all x > X,. 

Denote by y; the element y given in (7.8) for x = 7‘ > -x,, that is, 
T'<9¥i< yet < T and put £ = log xo/log T where xo = max(x,, x,), 
and » = log(xx;—')/log 7. Thus for each —§ < i < 9, T* > x9 and x7‘ > x. 

It follows now by (9.2) in view of (7.7), (7.8), and (7.6) that 


|h(x)| log"**x < (n+1)M~* D0 (Nw)~'g(w)|h(x/Nw)| 


z(Nw) ~!<zo 


+(n+IM(A+e YD (Nw) 'g(w) 


rzo—'<z(Nw) ~'<z 


+ > > [A— (A+ e)](Nw)~"g(w) 


E<i<g yi<(Nw)-'!2<yit 


<K DY (Nw) 'g(w) + (n +1)M—(A + &)[M(n + 1)* log”*'x 
<2 


zzq~!<Nw 


+ o(log"*'x)] + [A — (A + 0)]C & log"(x7T~*"), 
E<i<y 
since A— (A +e€) <0 and xy;-'>xT-*'. Now, the first term is 
< Kxox—'( Mx log"x + o(x log"x)) = o(log"*+'x). For the third term we have 
by Lemma 1, 
Dd log"(xT~*-T~*) = — [€] log"xT~** + [n] log*xT** 


t<i<y 
r 
bt f [u]d log"xT** -_ (n + 1)** log~'T log"**x + o(log”**x) 
E 


as follows immediately by standard method of replacing [u] by u and noting 
that Tf = xo, 7? = xx;'. 
Thus 


|a(x)| <A +e + (A —A — C/M log T + O(1). 


As x— © with |h(x)| +A we get A<A+e+(A—A-—0C',C'’>0. 
But this cannot be true for all « > 0 since (A — A)C’ < 0. This contradiction 
leads to the conclusion that A = 0. 

We apply now Theorem 10 to the following function: g(w) = |f(w)|A2(w) 
and 










(7.9) $$h(x) = x^{-1}\sum_{Nw \le x} f(w)\Lambda(w) + \begin{cases} -1 & \text{in Case I} \\ \phantom{-}0 & \text{in Case II} \\ +1 & \text{in Case III.} \end{cases}$$

Since $2 - \delta < 1$, it follows by (7.2) that $|f(w)|\Lambda_2(w)$ satisfies (g1). Clearly,
(7.1) means that h(x) given in (7.9) satisfies (h1). Condition (h3) follows by
(7.1) and (7.3),


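$$|h(tx) - h(x)| \le \Bigl|\sum_{Nw \le x} f(w)\Lambda(w)\,(t^{-1} - 1)x^{-1}\Bigr| + \Bigl|\sum_{x < Nw \le tx} f(w)\,t^{-1}x^{-1}\Lambda(w)\Bigr|$$
$$\le K\,|t^{-1} - 1| + 2(t-1)t^{-1} + o(1) = o(1) \quad\text{as } (t,x) \to (1,\infty).$$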














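To obtain (h2) we put $\sigma = 1$ in Case I, $\sigma = 0$ in Case II, and $\sigma = -1$
in Case III:
$$\Bigl|\sum_{\nu \le x}\nu^{-1}h(\nu)\Bigr| = \Bigl|\sum_{\nu \le x}\nu^{-1}\Bigl[\nu^{-1}\sum_{Nw \le \nu} f(w)\Lambda(w) - \sigma\Bigr]\Bigr|$$
$$= \Bigl|\sum_{Nw \le x} f(w)\Lambda(w)\sum_{Nw \le \nu \le x}\nu^{-2} - \sigma\log x + O(1)\Bigr|$$
$$= \Bigl|\sum_{Nw \le x} f(w)\Lambda(w)\bigl[(Nw)^{-1} - x^{-1} + O(Nw^{-2})\bigr] - \sigma\log x + O(1)\Bigr|$$
$$\le \Sigma_1 + \Sigma_2 + \Sigma_3 + O(1),$$
where
$$\Sigma_1 = \Bigl|x^{-1}\sum_{Nw \le x} f(w)\Lambda(w)\Bigr| = O(1) \quad\text{by (7.1), and}$$
$$\Sigma_2 = O\Bigl(\sum_{Nw \le x}(Nw)^{-2}|f(w)|\Lambda(w)\Bigr), \qquad \sum_{Nw \le x}(Nw)^{-2}|f(w)|\Lambda(w) \le \sum_{Nw \le x}|f(w)|\log Nw\,(Nw)^{-2} = O(1),$$
as follows immediately by (6.5b).

We can conclude from Theorem 9 that $\Sigma_3 = |I_{f\Lambda}1 - \sigma\log x| = O(1) +
O(\log^{2-\delta}x)$ which is O(1) if $2 - \delta < 0$, that is, $\delta > 2$ or $\gamma > 3$. From this we
can conclude Theorem A for $\gamma > 3$. To obtain our result for $\gamma > 2$ we need
a refinement of Theorem 9, which we can carry out only in the following form.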




THEOREM 11. If f is a character satisfying (6.1) and (6.3) with $A \neq 0$,
then


| O(log*~*x) in Case I 
Ij, = — F"(D)F(D) + O(1) + uae in Case II 
O(log"***x) in Case III. 


Before proceeding with the proof of this theorem, we observe that with
these results it follows now that $\Sigma_3 = O(1)$ if $\delta > 1$ in Cases I and II, and
only in Case III we have to assume that $\delta > 2$. To complete the proof of our
first main theorem, we establish














THEOREM 12. If f is a character satisfying (6.1) and (6.2) and $A \neq 0$,
then
$$\sum_{Nw \le x} f(w)\Lambda(w) = \begin{cases} x + o(x) & \text{in Case I and } \gamma > 2, \\ o(x) & \text{in Case II and } \gamma > 2, \\ -x + o(x) & \text{in Case III and } \gamma > 3. \end{cases}$$


We still have to prove the validity of (g2). Indeed by (6.5a) 
Iyh(x) = I(x" Spal) = x *S,Syal — ol fl 
- O(log *x) — ao in Cases II and III 
= 3 l = 1 = - 
«Sin ots on *x) —a; —ao in Case I 


since [~' = x—'S, (as operators) and Sy, = S,;-1S;,. Hence if a = —ap or 


Tm log x Iph(x) = aly, log x + I,,0(log'~*x) 
= O(1) + O(log?-*x) = o(logx), 
by Theorem 8 and Corollary 2. We now proceed similarly to (1, Lemma 6.1) 
Ty, log x Ty = I(T, log x + Ipz) = log x + Iya. 
It follows therefore that 
\h(x)|log x — Typ, alh(x)| < |(log x + Iya)h(x)| = |I,log x Ih(x)| = o(log x). 
As in (1, Lemmas 6.3 and 6.5) we obtain 
(log*x — Zizi a,)|A(x)| = (log x + Tis; )(log x — Tip, )|h(x)| 
< (log x + Is; )o(log x) = o(log*x). 
That is, 
|h(x)|log*x < Iir;a,|h(x)| + o(log*x) 
which proves (g2), after verifying easily as in (1, Lemma 6.3) with the aid 
of Theorem 11 that J),;,0(log x) = o(log?x). 
We return now to the proof of Theorem 11. From Lemma 2 and (7.1) it 
follows that 
(7.9) I\7;sO(log’x) = O(1) + O(log’*'x). 
The proof is similar to the proof of Theorem 8. It follows by (6.5a) that 
ax log x — ax + a + O(x log~*x) = Syzl = SpaSsl 
= Syslax + o(x/log'**x)] = axI,41 + xI¢,O(log—'~*x) 
= axI;,1 + xO(log~*x) + O(x). 


Thus, if a +0, we have: J;,l = log x + O(1) + O(log~*x) which yield 
Ro(x; fA, — F’F-') = O(1) + O(log~*x). 













It follows now from the relation fL = f*fA and by (5.6) and Theorem 7 
using induction that 


ll 


cl saRy (x; f, F) + (x |R, (x; fA, -F'F*)) 
4S 


jm () 


Resi(xi fA, —F7'F’) 


+ O(1) + Ra(x;fA*f, —F"'F’-F) 
T,,sO(log**x) + O(1) + R,(x;fL, — F’) + O(log"~*x) 
O(log"***x) + O(1), 


which prove the first case of Theorem 11. 
The other cases follow as in Theorem 8: 


n+p—1 
Resp(xif A, —FU'F’) = cl yaRa(xif, F) + o( ¥ IR; (x; fA, - FP) 


j=0 


+ O(1) + R,(x;fL, —F’) = O(log"***x) + O(1). 


In Case II, = 0 and in Case III, p = —1, which readily imply by induction 
the other two cases of Theorem 11. 
We conclude this section with the last case a = A = 0 (which implies 


ay + 0). Here we do not have to use Theorem 10. The proof of (7.1) which 
leads to (7.9) holds in this case, and consequently, Theorem 11 (Case II)
is also valid. Writing 


h(x) = x7" > A(w) = x *Syal. 


Nwaz 
As in the first part of the proof of Theorem 12, we obtain 
h(x) log x + I,sh(x) = I,,log x I,(x~"S,al) = Ix log xSyxl 
= 1, ,O(log**x) = O(1) + O(log” x). 


The power series in D corresponding to J\,; will be of the form G(D) = 
Ay +A,D+..., and Ay # 0 (by Case IV of Theorem 8). Thus it follows 
from Theorem 11 that 


Tigial = — Ao~'A, + O(log'~*x). 
Thus, since A(x) = O(1) 


|h(x)| log x < Ip; a|h(x)| + O(log?-*x) + O(1) 
< O(1).I; 7,41 + O(log*-*x) + O(1) = O(log*-*x) + O(1). 


Consequently |h(x)| < O(log'~*x) + O(log~'x). That is, 


THEOREM 12*. If f is a character satisfying 


> f(w) = O(x/log’**x) 


Nwezt 


and 
















>. |f(w)| = O(x/log’**x), 


Nwer 


then 


$$\sum_{Nw \le x} f(w)\Lambda(w) = O(x/\log^{\delta-1}x) + O(x/\log x).$$


8. Proofs of Theorems A and B. We return now to the situation of
§2. Let H denote a generic class W/W' and let $e_H(w)$ be the characteristic
function of H, that is, $e_H(w) = 1$ if $w \in H$ and zero otherwise. From the
properties of characters (2.2) and (2.3) we have the relations

(8.1) $$\chi = \sum_H \chi(H)\,e_H, \qquad e_H = h^{-1}\sum_\chi \bar{\chi}(H)\,\chi.$$

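We assumed in (2.4) that

(8.2) $$S_{e_H}1 = \sum_{Nw \le x,\ w \in H} 1 = c_H x + O(x/\log^\gamma x), \qquad c_H \ge 0.$$

Thus
$$S_\chi 1 = \sum_H \chi(H)S_{e_H}1 = A_\chi x + O(x/\log^\gamma x), \qquad A_\chi = \sum_H \chi(H)c_H,$$
and for the identity $\chi_0 = E$ also
$$S_{\chi_0}1 = cx + O(x/\log^\gamma x), \qquad A_{\chi_0} = c = \sum_H c_H > 0.$$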


The characters are thus functions of the type which were dealt with in the
preceding sections. Let
$$L_\chi(D) = \sum_\nu L_\nu(\chi)D^\nu; \qquad L_{-1}(\chi) = A_\chi$$
be the polynomial corresponding to $I_\chi$ in (6.6), then we distribute the characters
of K in three classes

$\Gamma_1 = \{\chi;\ \chi \in K,\ A_\chi = L_{-1}(\chi) \neq 0\}$,
$\Gamma_2 = \{\chi;\ \chi \in K,\ A_\chi = 0,\ L_0(\chi) \neq 0\}$,
$\Gamma_3 = \{\chi;\ \chi \in K,\ A_\chi = 0,\ L_0(\chi) = 0\}$.

Theorem 8 now implies that (all characters are in our case subjected to
Cases I-III):
COROLLARY 3. 


[ O(log*~*x), x €TPr; 
Ty, = Ly'(D) + O11) + 4 O(log"***x), x EPs: 
| O(log"***x), x € Ps. 


From Theorem 12 we now obtain Theorem A. 











Theorem A and (8.1) yield 


> en(w) A(w) = ye A(w) =h* > x(H)x(w) A(w) 
Nwaez Nw Nwer 
am 


= b> x(H) - > Ne + o(x) = dyx + o(x) 


From here we can follow the ideas developed in (4), but replacing the "Dirichlet
density" k of a set S, defined there, by the sum
$$\sum (Nw)^{-1}\Lambda(w) = k\log x + O(1)$$
by dealing in a parallel way with the sum
$$\sum \Lambda(w) = kx + o(x)$$
and by calling k the Dirichlet density of the set S. As the reasoning is identical
with that of (4, p. 602) as well as the passage from 


va(x) = >> A(w) 


Nwa<z 
weH 


to 


Ty(x) = a ] 


we just quote the final results. 

(a) $\Gamma_1$ is a subgroup of index 1 or 2 in the group $\Gamma_1 \cup \Gamma_3$.

(b) Let U = {w; x(w) = 1,x € Ti}, then W’ C UC W and U/W’ has 
lr, as the group of characters. Put U* = {w;w € U, x(w) = 1 forall x € Ts} 
then U D> U* D W’ and Theorem B is valid for these groups U and U*. 
(Compare with (4, Theorem 3.1).) 


9. Other ‘‘elementary results.”’ In the present section we shall outline 
the extensions of the elementary proofs of (1) to our case. These lie in the 
following extension of (1, Theorem 9.2). 


THEOREM 13. Let g € C(W); G(D) = yD + yg Do" +... 
Let f be a character satisfying (6.1) and (6.2). Then forn <i —q— 1 


I ,-19 log"x = [F~'(D)G(D)] log"x + 0(1) 
where F(D) is given in (6.6), provided that the following conditions hold: 
(i) I,.R,(x;g,G) = O(1) fory Cn + 1, 
(ii) I\,s; log x R,(x; g,G) = o(log x) 
(iii) Z\,\1 = O(x*) where h = f~—'*g and @ < 1 








(iv) ) (Nw) "|h(w)| = o(1) as (t,x) (1, @) 
r<Nws iz 

(v) Case I and g < 2; Case II and g < 1; Case III and g < 0. 

Before proceeding with this proof we give here some examples as applications. 

Example 1. g = ¢ is the identity, and G(D) = 1. Thus R,(x; e, 1) = O(1) 
for all » which yields (i) and (ii) trivially. Here, h = f-'*e = f-' hence J\,)1 
= O(log x), by (6.3) and (6.2) which proves (iv). To prove (v) we obtain by 
(6.3) and (6.9) that for some K > 0, 


O< > (Nw) *|f-*(w) | <K 7. (Nw)~"|f(w)| = KA log t + O(log*x) 


2<Nws tt r<Nw<tz 
= o(1) 
as (t,x) — (1, ©). Consequently, 
COROLLARY 4. 
I,-1 log"*x = F-*(D)log"x + 0(1) for n<6—1. 
In particular this yields, for 6 > 1: 
( xe 
| o(1) in Case I 
u(w)f(w) _ | 
I, ” Be wma } ao + o(1) in Case II 
lai’ log x + asa; + 0(1) in Case ITI. 
This includes for the case f(w) = 1, one of the equivalent forms of the prime 


number theorem, but our effort to follow the classical proof of that theorem 
from this result failed, and we have obtained the prime number theorem in §8 
in a different way. 


Example 2. g(w) = (fL)(w) = f(w) log Nw, and G(D) = — F’(D). We 
shall consider only the case A +0 (in Case I, g = 2, Cases II and III, g <0). 
In this example R,(x;fL, — F’) = O/log**'-*x (Theorem 7) and thus Corol- 
lary 2 implies that 


T,1R,(x;fL, — F’) = O(log’*?-*x) + O(1) = O(1) 


for all » satisfying » + 2 — 6 <0. This proves the validity of (i) for all 
n-+3 — 6 < 0. But then (ii) also holds since 


I, log x R,(x;fL, — F’) = I,O(log"*?-*x) = O(1) + O(log**+*-*x) = O(1). 


Now h = f~'*fL = fA which implies the validity of (iii) by Theorem 9 when 
applied to |f|, since I\,,1 = O(log x) + O(log?-*x). The last condition (iv) 
follows readily from (7.3). Consequently we obtain by applying Theorem 13 
that 


COROLLARY 5. 


T,, log*x = — [F-'(D)F’(D)] log"x + o(1) if > » + 3. 










In particular, if $\sigma = +1, 0, -1$ in Cases I, II, III respectively, then for
$\delta > 3$:
$$\sum_{Nw \le x}\frac{f(w)\Lambda(w)}{Nw} = \sigma\log x + \beta + o(1)$$
which is another equivalent form of the prime number theorem in the case
of integers with f(w) = 1. Note that this is obtained only for $\delta > 3$ (that is,
$\gamma > 4$) whereas the other equivalent form
$$\sum_{Nw \le x} f(w)\Lambda(w) = \sigma x + o(x)$$
was obtained in Theorem 12 for $\gamma > 2$.

Now for the proof of Theorem 13. We wish to show that the function 
h(x) = R,(x;f~'*g, F-'G) satisfies the requirements of Theorem 10 with 
g(w) = |f(w)|A2(w). It was already proved in the preceding that g(w) satisfies 
(g1) of Theorem 10 if 6 > 1, and we now prove the validity of (g2). It follows 
from Theorem 3 that 
R, (x; g, G) = Ry(x;fe(f-'sg), 


n+p 


F(F'G)) = I,Ra(x; f~'*g, F-'G) + Do cy Ry (xi f, F) +, 
j=0 
where D~-* is the first power of D in F-'G, and for some constants c,a (a = 0 
if f is a character satisfying Cases II and III). Operating with J,-1‘log x 
on this result, we find 


I ,-: log x 1 Ra(xif'*g, FG) 
n+p 
= [,-1 log xR,(x;g,G) — yi c; Iy-1 log xR; (x; f, F) — al, loge 
j=0 
= o(log x) + O(log"*?#"*~* x) + O(1) + O(log’ x) 
= o(log x) 
ifin+p+1-—-—6 <1. Since R, (x;f, F) = O(log’-*x) by Theorem 6 the rest 
follows like the proof of Theorem 12, that is 
T,-1 log x I, Ry(x; f—'*g) = (log x + Iya) R,(x; f~'*g) = o(log x), 
from which we deduce that 
|Ra(x; f—'g)| log*x < Ty 5, 4,|Ra(x; f—'*g)| + o(log*x), 


namely, (g2). 
To prove conditions (hl) — (h3), we first observe that 


n+@ 
R,(f-'*g) = Ip+Ra(g) + Dc, Ry (f*) +a = O(1) + O(log”***x) 
j=0 
where a + 0 if the situation is of Case III. This shows that 
R, (f-'*g) = O(1) and Ryayi(f~*g) = O11) 
ifu+i+gq-—6<0. 













The completion of the proof follows the computation of (1, pp. 306-7). 
We do not repeat the computation but present the final result in the following 
proposition. 


PROPOSITION 9. Let l(w) € C(W) and 


L(D) = > 7D". 


Then 
(9.1) 2 wp Ra(uil, L) = (m + 1) "*Ragi(xsl, L) + O(1) 
+ O(d (Nw )~*!1(w) | log” Nw) 
and 
(9.2) Ra(tx;l,L) — Ra(x;l,L) = > (Nw) ~'l(w) log"(tx/Nw) 


Nwaer 


n—1 
+2 (") log” “tR; (x; 1, L) + O(log” 'x log**'t), 


j=0 


with the last factor omitted if L(D) does not contain negative powers of D.


Thus in our case, / = f—'*g, L = F-'G we observe that (iii) of Theorem 13 
yields, by Lemma 2: 


> (Nw) *|l(w)| log"Nw = O(x")x™* log"x — | O(t’)d{t~log"t] = O(1) 
Nwest 1 
which shows that R,4:(f~—'*g) = O(1) implies (h2). Condition (h3) of Theorem 
10 follows from (iv), since in our case Rj(x;/,L) = O(1) and m = 1 (in Case 
III, we have to require that g = 0) imply that 
> (Nw)~'l(w)log"(tx /Nw] = o(1). 
r<Nwse itz 

This completes the proof that if m + q¢ + 1 < 6 and (a) Case I with g < 2, 
(b) Case II, g < 1, or (c) Case III, g < 0, all conditions of Theorem 10 are 
fulfilled. Thus the proof of Theorem 13 is complete. 


10. The character $f(n) = n^{it}$. We conclude our result with an application
for the semi-group of integers and the character $f(n) = n^{it}$, $t \neq 0$ fixed.

Clearly f is a character, and satisfies $I_{|f|}1 = \sum n^{-1} = \log x + c + O(x^{-1})$
and $I_f 1 = \sum n^{-1+it} = c_t + O(x^{-1})$ where $c_t = \zeta(1 - it)$. Thus this function f
satisfies the condition of either Case II or Case III.

If (III) is valid then we would get from Theorem 13 (Example 2) that
$$I_{f^{-1}*fL}1 = I_{f\Lambda}1 = \sum_{n \le x}\Lambda(n)n^{-1+it} = -\log x + c + o(1).$$

But since
$$\sum_{n \le x}\frac{\Lambda(n)}{n} = \log x + c_0 + o(1)$$
(the case f(n) = 1) we have by Lemma 2
$$-\log x + c + o(1) = \sum_{n \le x}\Lambda(n)n^{-1+it}$$
$$= [\log x + c_0 + o(1)]\,x^{it} - \int_1^x [\log u + c_0 + o(1)]\,du^{it}$$
$$= O(1) + \int_1^x o(1)\,du^{it} = o(\log x).$$

Indeed, if the function $|o(1)| < \epsilon$ for $x > \lambda$ and $|o(1)| < K$ for $x \le \lambda$, then
$$\Bigl|\int_1^x o(1)\,du^{it}\Bigr| \le K|t|\int_1^\lambda u^{-1}\,du + |t|\epsilon\int_\lambda^x u^{-1}\,du \le M + \epsilon|t|\log x. \quad\text{Q.E.D.}$$

Thus Case III is disposed of and there remains Case II, which means that
$$\sum_{n=1}^{\infty} n^{-1+it} = \zeta(1 - it) \neq 0$$
and consequently that the series
$$\sum_{n=1}^{\infty} n^{-1+it}\Lambda(n)$$
converges for all $t \neq 0$. From this one readily proves that
$$\sum p^{-1+it}\log p$$
converges.


REFERENCES 


1. S. A. Amitsur, On arithmetic functions, J. d’Anal. Math., 5 (1956-7), 273-317. 

2. ———— Some results on arithmetic functions, J. Math. Soc. Japan, 11 (1959), 275-290.

3. A. Beurling, Analyse de la loi asymptotique de la distribution des nombres premiers généralisés,
Acta Math., 68 (1937), 255-291.

4. H. Shapiro and W. Forman, Abstract prime number theorems, Comm. Pure and Applied 
Math., 7 (1954), 587-619. 

5. K. Yamamoto, Theory of arithmetic linear transformations and its applications to an ele- 
mentary proof of Dirichlet’s theorem, J. Math. Soc. Japan, 7 (1955), 424-434. 

6. ———— Arithmetic linear transformations in an algebraic number field, Mem. Fac. Sci., 
Kyushu Univ. Sci. A. 12 (1958), 41-66. 

7. Hardy and Wright, The theory of numbers (Oxford, 1954). 


Hebrew University 











BLOCK DESIGN GAMES 
A. J. HOFFMAN AND MOSES RICHARDSON¹


In this paper, we define and begin the study of an extensive family of simple 
n-person games based in a natural way on block designs, and hitherto for the 
most part unexplored except for the finite projective games (13). They should 
serve at least as a proving ground for conjectures about simple games. It is 
shown that many of these games are not strong and that many do not possess 
main simple solutions. In other cases, it is shown that they have no equitable 
main simple solution, that is, one in which the main simple vector has equal 
components. On the other hand, the even-dimensional finite projective games
$PG(2s, p^n)$ with $s > 1$ possess equitable main simple solutions, although they
are not strong either. These results are obtained by means of the study of the
possible blocking coalitions. Interpretations in terms of graph theory, network
flows, and linear programming are discussed, as well as k-stability, automorphism
groups, and some unsolved problems.


1. Preliminaries on block designs. Block designs have long been studied 
from various points of view and have an extensive literature, an introduction 
and references to which can be found in Hall (8). 


By a block design² we shall mean a set N of v elements $\{1, 2, \ldots, v\}$, and
a family of b distinguished subsets $W_1, W_2, \ldots, W_b$ of N called blocks, such
that

(a) every $W_j$ contains k elements, $k < v$,

(b) every element x belongs to r blocks.

A block design may be specified by means of its incidence matrix $A = \|a_{ij}\|$
with v rows and b columns, where $a_{ij} = 1$ if the ith element belongs to the
jth block and $a_{ij} = 0$ if not. The numbers v, b, k, r are termed parameters of
the design. Clearly,

(1) $$vr = bk,$$

since each side represents the total number of ones in the incidence matrix. A
block design is termed symmetric if v = b or, equivalently, k = r. A block design
is termed balanced if every two elements occur together in $\lambda$ blocks. The
numbers v, b, k, r, $\lambda$ are termed the parameters of the balanced block design
and satisfy, in addition to (1), the relation

(2) $$r(k - 1) = \lambda(v - 1).$$
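The definitions above can be checked mechanically on a small case. The sketch below is illustrative only: the labelling of the seven-point projective plane and the function name `balanced_parameters` are assumptions introduced here, not taken from the paper. It reads off v, b, k, r, λ from a list of blocks so that relations (1) and (2) can be confirmed.

```python
from itertools import combinations

# One common labelling of the seven-point projective plane (an assumed example).
FANO_BLOCKS = [{1, 2, 3}, {1, 4, 5}, {1, 6, 7}, {2, 4, 6},
               {2, 5, 7}, {3, 4, 7}, {3, 5, 6}]

def balanced_parameters(blocks):
    """Return (v, b, k, r, lam) for a balanced block design, else raise ValueError."""
    points = sorted(set().union(*blocks))
    v, b = len(points), len(blocks)
    # Each of these sets must collapse to a single value for a balanced design.
    block_sizes = {len(block) for block in blocks}
    replications = {sum(x in block for block in blocks) for x in points}
    pair_counts = {sum(x in block and y in block for block in blocks)
                   for x, y in combinations(points, 2)}
    if len(block_sizes) != 1 or len(replications) != 1 or len(pair_counts) != 1:
        raise ValueError("not a balanced block design")
    return v, b, block_sizes.pop(), replications.pop(), pair_counts.pop()
```

For the blocks listed, `v * r == b * k` and `r * (k - 1) == lam * (v - 1)` both hold, matching (1) and (2).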

¹Received January 4, 1960. Some of the work of this paper was done while this author was
partly supported by a National Science Foundation Faculty Fellowship.

²Also referred to as incomplete block design and tactical configuration in the literature.
Cf. (2; 3; 4; 10; 11).




A symmetric balanced block design is often referred to as a $(v, k, \lambda)$-system.
Perhaps the most familiar balanced block designs are the finite geometries,
projective and euclidean, where the points are taken as the elements and the
lines as the blocks; for these, we have $\lambda = 1$ since two points determine a line.
Other balanced block designs which have long been studied are the Steiner
triple systems (cf. 8; 10; 11) for which either $v = 6t + 1$, $b = t(6t + 1)$, $r = 3t$,
$k = 3$, $\lambda = 1$ or $v = 6t + 3$, $b = (2t + 1)(3t + 1)$, $r = 3t + 1$, $k = 3$, $\lambda = 1$.
The Steiner triple systems with v = 1, 3, 7 we shall here term trivial. The case
v = 7 is the familiar seven-point projective plane.
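The two parameter families quoted for Steiner triple systems follow from relations (1) and (2) with k = 3 and λ = 1. A minimal sketch, with a hypothetical function name, that derives b and r from v in this way:

```python
def steiner_triple_parameters(v):
    """Return (v, b, k, r, lam) for a Steiner triple system on v elements.

    Uses r(k - 1) = lam(v - 1) and vr = bk with k = 3, lam = 1; these force
    v to leave remainder 1 or 3 on division by 6.
    """
    if v % 6 not in (1, 3):
        raise ValueError("v must be of the form 6t + 1 or 6t + 3")
    r = (v - 1) // 2
    b = v * r // 3
    return v, b, 3, r, 1
```

With v = 6t + 1 this gives r = 3t and b = t(6t + 1); with v = 6t + 3 it gives r = 3t + 1 and b = (2t + 1)(3t + 1), as stated in the text.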

A block design is termed partially balanced if: 

(A) There exist non-negative integers $\lambda_1, \lambda_2, \ldots, \lambda_h$ and positive integers
$n_1, n_2, \ldots, n_h$ such that to every element x corresponds $n_j$ other elements,
called jth associates of x, with the property that any jth associate of x occurs
together with x in $\lambda_j$ blocks, and

(B) if x and y are ith associates then the number of elements which are
jth associates of x and kth associates of y is $p^i_{jk}$.

The numbers $v, b, k, r, \lambda_1, \ldots, \lambda_h, n_1, \ldots, n_h, p^i_{jk}$ are termed the parameters
of the partially balanced block design. It is understood that the numbers
$\lambda_j$, $n_j$, $p^i_{jk}$ are independent of the choice of element. We shall suppose that
$h > 1$. If h = 1, so that all $\lambda_j$ may be replaced by $\lambda$ and all $n_j$ by $v - 1$, then
the block design is balanced.

A partially balanced block design with two associate classes is termed group
divisible if the elements can be divided into m groups each with n elements so
that pairs of elements in the same group occur together in $\lambda_1$ blocks and pairs
of elements in different groups occur together in $\lambda_2$ blocks, $\lambda_1 \neq \lambda_2$. It is clear
that $n_1 = n - 1$ and $n_2 = n(m - 1)$. A group divisible design (cf. 2; 5) is
termed singular if $r = \lambda_1$, semi-regular if $r > \lambda_1$ and $rk = v\lambda_2$, regular if $r > \lambda_1$
and $rk > v\lambda_2$.

Let $s_{ij}$ be the number of elements common to the ith and jth blocks of a
design; the matrix $S = \|s_{ij}\| = A^T A$. It is known that in a symmetric balanced
block design all $s_{ij} = \lambda$. It is also known (cf. 5) that: for a regular symmetric
group divisible design,

(3) $$\lambda_2(r - \lambda_1)/(r^2 - v\lambda_2) \le s_{ij} \le \lambda_1 \quad\text{if } \lambda_1 > \lambda_2,$$
$$\lambda_1 \le s_{ij} \le \lambda_2(r - \lambda_1)/(r^2 - v\lambda_2) \quad\text{if } \lambda_1 < \lambda_2;$$

for a symmetric regular group divisible design with $r^2 - v\lambda_2$ and $\lambda_1 - \lambda_2$
relatively prime, all

(4) $$s_{ij} = \lambda_1 \text{ or } \lambda_2;$$

for a symmetric semi-regular group divisible design 


. 2ror’* ead 
(5) M< Sig < , + oe i Te Ai. 


* 


If the block design D* has parameters v*, b*, k*, r* and incidence matrix
A, then the dual block design D has parameters v = b*, b = v*, k = r*, r = k*
and incidence matrix A', the transpose of A (cf. 3; 15).


Restriction. We shall henceforth confine ourselves to block designs, desig- 
nated by D, of which the incidence matrices have no two column vectors 
equal, and correspondingly to block designs, designated by D*, of which the 
incidence matrices have no two row vectors equal. Thus no two blocks of a 
design D are to be equal sets. 

If D* is a partially balanced design with all $\lambda^*_j > 0$, then in the dual
design D every pair of distinct blocks has a non-empty intersection. If D* is a
balanced design with $\lambda^* > 0$, then in the dual design D the intersection of
every pair of distinct blocks has $\lambda^*$ elements.


2. Preliminaries on simple games. Let N be a finite set $\{1, 2, \ldots, v\}$ of
v players. Let $\mathcal{N}$ be the class of all subsets of N, each of which is called a
coalition. If $\mathcal{S} \subset \mathcal{N}$, let $\mathcal{S}^+$ be the class of all supersets of elements of $\mathcal{S}$, and
let $\mathcal{S}^*$ be the class of all complements of elements of $\mathcal{S}$. By a simple game is
meant an ordered pair $G = (N, \mathcal{W})$ where $\mathcal{W}$ is a subclass of $\mathcal{N}$ satisfying

(α) $\mathcal{W} = \mathcal{W}^+$,
(β) $\mathcal{W} \cap \mathcal{W}^* = \emptyset$.

Elements of $\mathcal{W}$ are termed winning coalitions, elements of $\mathcal{L} = \mathcal{N} - \mathcal{W}$ are
termed losing coalitions, and elements of $\mathcal{B} = \mathcal{L} \cap \mathcal{L}^*$ are termed blocking
coalitions. The simple game⁴ G is termed strong if and only if $\mathcal{B} = \emptyset$. A simple
game may be defined by specifying the class $\mathcal{W}^m \subset \mathcal{W}$ of minimal winning
coalitions, by virtue of condition (α). Let $W_1, W_2, \ldots, W_b$ be the minimal
winning coalitions.

A dummy is a player i such that $f(S \cup \{i\}) = f(S)$ for all $S \in \mathcal{N}$ where f
is the characteristic function of the game. We shall confine ourselves here to
strictly essential games, that is, having no dummies. We use the 0-1
normalization.


A vector $(a_1, a_2, \ldots, a_v)$ of non-negative real components such that

(6) $$\sum_{i \in S} a_i = 1 \quad\text{for } S \in \mathcal{W}^m,$$

(7) $$\sum_{i \in S} a_i > 1 \quad\text{for } S \in (\mathcal{W} \cup \mathcal{B}) - \mathcal{W}^m$$

is termed a main simple vector (cf. 12; 14; 7). If there exists a main simple
vector, then the finite set of imputations $X = \{x^{(S)} \mid S \in \mathcal{W}^m\}$ where

$$x^{(S)}_i = \begin{cases} a_i & \text{if } i \in S \\ 0 & \text{if } i \notin S \end{cases}$$

³It is possible for a design with distinct blocks to have a dual with two or more blocks equal;
see for example, design S12 in (2). If, in a balanced design, $\lambda < r$, then two row vectors of the
incidence matrix cannot be equal. If, in a partially balanced design, all $\lambda_j < r$, the same
conclusion holds.

⁴In (12), the terminology is such that all simple games are those which are termed strong
here and in (14). We shall use the terminology of (14) throughout.






is termed a main simple solution of the simple game G. A player i is indifferent
relative to the main simple solution X if $a_i = 0$. This can occur without i
necessarily being a dummy. We shall suppose that there are no indifferent players
throughout. A main simple solution will be termed equitable if all the components
$a_i$ of its main simple vector are equal.

A necessary condition for a set $X \subset N$ to be a blocking coalition is that
the row vectors $R_i$, $i \in X$, in the incidence matrix, with columns corresponding
to minimal winning coalitions and rows corresponding to players, shall
have a boolean sum equal to the unit vector $U_b$, all b components of which are
ones; that is,
$$\sum_{i \in X} R_i = U_b$$
where the summation is boolean.
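For small games the blocking coalitions can be listed by brute force. The sketch below is an illustration under stated assumptions: the function name, the labelling of the seven-point plane, and the choice of the dual of the complete 4-gon as a second example are introduced here. A coalition is blocking when it contains no block (so it is losing) and meets every block (so its complement is losing as well); the second test is exactly the boolean-sum condition above.

```python
from itertools import combinations

def blocking_coalitions(players, blocks):
    """List all coalitions that are losing and whose complement is also losing."""
    players = sorted(players)
    blocks = [frozenset(block) for block in blocks]
    found = []
    for size in range(1, len(players)):
        for members in combinations(players, size):
            coalition = frozenset(members)
            meets_every_block = all(coalition & block for block in blocks)
            contains_no_block = not any(block <= coalition for block in blocks)
            if meets_every_block and contains_no_block:
                found.append(coalition)
    return found

# Seven-point projective plane, with an assumed labelling of the points.
FANO = [{1, 2, 3}, {1, 4, 5}, {1, 6, 7}, {2, 4, 6}, {2, 5, 7}, {3, 4, 7}, {3, 5, 6}]

# Dual of the complete 4-gon: players are the 6 edges, blocks are the 4 vertex stars.
EDGES = list(combinations(range(1, 5), 2))
STARS = [{edge for edge in EDGES if vertex in edge} for vertex in range(1, 5)]
```

Calling `blocking_coalitions(range(1, 8), FANO)` and `blocking_coalitions(EDGES, STARS)` contrasts a strong game with one that is not strong; in the second case the pair of opposite edges (1, 2), (3, 4) is one of the coalitions returned.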


3. Block design games. Any block design D, subject to the restriction at 
the end of § 1, in which, furthermore, every pair of blocks has a non-empty 
intersection, may be used to define a simple game, called a block design game, 
in which the players correspond to the elements of the design and the minimal 
winning coalitions correspond to the blocks of the design. Particular examples 
are the finite projective games studied in (13), symmetric balanced block 
designs, group divisible designs with all $s_{ij} > 0$, the duals of finite euclidean
planes, the duals of Steiner triple systems, the duals of balanced block designs
with $\lambda > 0$, and the duals of partially balanced block designs with all $\lambda_j > 0$.

The following lemmas will be useful. 


LEMMA 1. If, in any simple game, there exists a blocking coalition B (properly)
contained in some minimal winning coalition W, then there exists no main simple
solution.

Proof. If there were a main simple vector we would have
$$\sum_{i \in W} a_i = 1 \quad\text{but}\quad \sum_{i \in B} a_i > 1.$$

LEMMA 2. If every blocking coalition B in a block design game is such that the
number of players in B is greater than k, then there exists an equitable main
simple solution.

Proof. We can take $a_i = 1/k$.

LEMMA 3. If, in a block design game, there exists a blocking coalition B of
which the number of players is less than or equal to k, then there exists no equitable
main simple solution.

Proof. For we would have
$$\sum_{i \in B} a_i \le 1$$
and therefore not > 1 as required.













4. Some theorems on block design games. We establish some theorems 
concerning blocking coalitions and main simple solutions of various block 
design games. Examples are collected in § 8. 


THEOREM 1. A block design game is not strong if one of the following conditions
holds:
(a) $v = 2k$, $b < \tfrac{1}{2}C(v, k)$,
(b) $v < 2k$, $b < C(v, k)$,
(c) $v > 2k$, and some $(v - k + 1)$-tuple of players constitutes a losing coalition.

Proof. Under hypothesis (a), at least one k-tuple of players is not in $\mathcal{W}^m$
and has its complementary k-tuple also not in $\mathcal{W}^m$, for the number of k-tuples
in $\mathcal{W}^m \cup \mathcal{W}^{m*}$ is $2b < C(v, k)$. Hence $\mathcal{B} = \mathcal{L} \cap \mathcal{L}^* \neq \emptyset$.

Under hypothesis (b), there exists a k-tuple not in $\mathcal{W}^m$ whose complementary
$(v - k)$-tuple is not in $\mathcal{W}$ since $v - k < k$, while any set in $\mathcal{W}$ has at
least k members.

Under hypothesis (c), the complement of the given $(v - k + 1)$-tuple is
also in $\mathcal{L}$ since it has only $k - 1$ members.
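Hypotheses (a) and (b) are pure counting conditions on the parameters and can be tested directly. A minimal sketch, with a hypothetical function name; hypothesis (c) depends on the blocks themselves and is not covered here.

```python
from math import comb

def theorem1_counting_hypothesis(v, k, b):
    """Return 'a' or 'b' when that counting hypothesis of Theorem 1 holds, else None."""
    # (a): v = 2k and b < C(v, k) / 2, written without fractions.
    if v == 2 * k and 2 * b < comb(v, k):
        return "a"
    # (b): v < 2k and b < C(v, k).
    if v < 2 * k and b < comb(v, k):
        return "b"
    return None
```

For example, the dual of the complete 4-gon has v = 6, k = 3, b = 4, so hypothesis (a) applies since 2b = 8 < C(6, 3) = 20. The seven-point plane (v = 7, k = 3, b = 7) has v > 2k, so neither counting hypothesis applies.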

More precise information concerning blocking coalitions in various block 
design games is given in the remaining theorems of this section. 


THEOREM 2. In any simple game, if there exists a player $x_1$ and a minimal
winning coalition $W_1$ containing $x_1$, such that every other minimal winning
coalition W containing $x_1$ intersects $W_1$ in more than one element, then there
exists a blocking coalition B which is a (proper) subset of $W_1$. Hence, under
these hypotheses, the game is not strong and there exists no main simple solution.

Proof. Let $W_1 = \{x_1, x_2, \ldots, x_k\}$, say, and let $B = W_1 - \{x_1\} = \{x_2, \ldots,
x_k\}$. Now every minimal winning coalition W different from $W_1$ must intersect
$W_1$, and furthermore, by hypothesis, must intersect B. Consequently, B is a
blocking coalition (properly) contained in $W_1$. The last sentence of the theorem
follows from Lemma 1.


COROLLARY. The hypotheses and hence the conclusions of Theorem 2 are
satisfied if the block design game D is any of the following:

(a) a symmetric balanced block design with $\lambda > 1$;

(b) the dual of any balanced block design with $\lambda > 1$; in particular, the dual
of the design formed by the s-spaces in a projective or euclidean m-space $PG(m, p^n)$
or $EG(m, p^n)$ with $1 < s < m$;

(c) the dual of a partially balanced block design with all $\lambda_j > 1$;

(d) a symmetric regular group divisible design with $1 < \lambda_1 < \lambda_2$;

(e) a symmetric regular group divisible design with $\lambda_1 > 2$ and $\lambda_2(r - \lambda_1) >
r^2 - v\lambda_2$;

(f) @ symmetric semi-regular group divisible design with , > 1; 

(g) a symmetric regular group divisible design with $r^2 - v\lambda_2$ and $\lambda_1 - \lambda_2$ relatively
prime and both $\lambda_1$ and $\lambda_2$ greater than one.













THEOREM 3. If D* is a balanced block design with parameters v*, b*, k*, r*,
and $\lambda^* = 1$, then its dual D yields a game which is not strong if either⁵ $k = 3$
and $r \ge 4$, or $k \ge 4$ and $r \ge 3$. In particular, under these hypotheses, there
exists a blocking coalition with $k + r - 2$ members.

Proof. In D we have $bk = vr$ and $k(r - 1) = b - 1$, and the intersection of
every pair of minimal winning coalitions has just one element. Consider any
block $W = \{x_1, \ldots, x_k\}$ of D. There are
$$r + (k - 2)(r - 1) = k(r - 1) - r + 2 = b - (r - 1)$$
blocks containing at least one of the elements $x_1, \ldots, x_{k-1}$, leaving $r - 1$
blocks intersecting W in $x_k$ only. In these $r - 1$ blocks there are $(r - 1)(k - 1)$
elements other than $x_k$. There exist $(k - 1)^{r-1}$ possible $(r - 1)$-tuples
with one element chosen from each of these $r - 1$ blocks. Excluding
W, there are $(r - 1)(k - 1)$ blocks containing elements of $\{x_1, \ldots, x_{k-1}\}$.

But, if $k = 3$ and $r \ge 4$, or if $k \ge 4$ and $r \ge 3$, then
$$(k - 1)^{r-1} > (k - 1)(r - 1).$$

Therefore, in this case, at least one such $(r - 1)$-tuple $\{y_1, \ldots, y_{r-1}\}$ exists
not forming a block together with any member of $\{x_1, \ldots, x_{k-1}\}$. Thus,
$\{x_1, \ldots, x_{k-1}, y_1, \ldots, y_{r-1}\}$ is a blocking coalition, which completes the
proof.
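The counting step in the proof rests on the inequality between the number of (r − 1)-tuples and the number of blocks to be avoided. A small sketch, with a hypothetical function name, that tabulates the gap and shows where the hypotheses on k and r come from:

```python
def counting_gap(k, r):
    """Number of (r-1)-tuples minus number of blocks to avoid in the proof of Theorem 3."""
    return (k - 1) ** (r - 1) - (k - 1) * (r - 1)

def theorem3_applies(k, r):
    """True when k = 3 and r >= 4, or k >= 4 and r >= 3."""
    return (k == 3 and r >= 4) or (k >= 4 and r >= 3)
```

The gap is positive throughout the range covered by the theorem and vanishes at the boundary cases k = r = 3 (the seven-point plane) and k = r = 2 (the triangle), which is consistent with those cases being excluded.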


COROLLARY 1. The dual D of a non-trivial Steiner triple system D* is not 
strong. 


Proof. Except for the cases v* = 1,3,7, which we have termed trivial, 
the Steiner triple systems have k* = 3 and r* > 4. Hence, in the dual, r = 3 
and k > 4. 


COROLLARY 2. If D* is the system of lines in the finite euclidean space EG(m, $p^n$) of m dimensions over the Galois field GF($p^n$) for $m \ge 2$ and $p^n \ge 3$, then the dual D is not strong.


Proof. In D* we have $v^* = p^{mn}$, $b^* = p^{(m-1)n}(1 + p^n + \dots + p^{(m-1)n})$, $k^* = p^n$, $r^* = 1 + p^n + \dots + p^{(m-1)n}$, and $\lambda^* = 1$. Hence in the dual, $r \ge 3$ and $k \ge 4$.


COROLLARY 3. If D* is the system of lines in the finite projective space PG(m, $p^n$) of m dimensions over the Galois field GF($p^n$) with $m \ge 3$ and $p^n \ge 2$, then the dual D is not strong.


Proof. In D*, we have $v^* = 1 + p^n + \dots + p^{mn}$,

$$b^* = \frac{(1 + p^n + \dots + p^{mn})(1 + p^n + \dots + p^{(m-1)n})}{1 + p^n},$$

$k^* = 1 + p^n$, $r^* = 1 + p^n + \dots + p^{(m-1)n}$, and $\lambda^* = 1$. Hence, in D, we have $k \ge 4$ and $r \ge 3$.


*It is easily verified that the only remaining cases are the triangle and the seven-point projective plane, which are strong, and the duals of complete n-gons, $n \ge 4$, also termed triangular association schemes below, which are not strong.


5. Some games with no equitable main simple solution. 


THEOREM 4. If D* is the system of hyperplanes in the finite euclidean m-space EG(m, $p^n$), $m \ge 2$, then in the dual D there exists a blocking coalition with $p^n$ members.


Proof. In EG(m, $p^n$), consider any family of $p^n$ parallel hyperplanes or (m − 1)-spaces, one through each point of a transversal line. Their union contains all the points of the space. In the dual D, these hyperplanes correspond to $p^n$ elements incident with all the blocks, no two of which elements occur together in any block. Therefore, these elements constitute a blocking coalition with $p^n$ members.


COROLLARY. The block design game D, dual to the system of hyperplanes of a finite euclidean m-space EG(m, $p^n$), $m \ge 2$, is not strong and has no equitable main simple solution.

Proof. The last conclusion follows at once from Lemma 3 of § 3. 


THEOREM 5. In the dual D of a non-trivial Steiner triple system D*, there 
exists a blocking coalition with k members. 


Proof. In D*, consider the set of r* triples containing a given element x, say. Delete any two of these triples, say (x, a, b) and (x, c, d) where a, b, c, and d are, of course, distinct elements. Then there exist triples (a, c, y) and (b, d, z) with $y \ne x$, $z \ne x$ in D*, since $\lambda^* = 1$ and since the trivial systems have been excluded. Replacing the two deleted triples by the latter two, we have a set of triples whose union contains all elements of D* and does not contain any set of all triples through any particular element. In the dual D, this corresponds to a set of elements incident with all blocks but not containing all elements of any particular block. This is a blocking coalition with k = r* elements.


COROLLARY. The block design game D, dual to a non-trivial Steiner triple 
system D*, is not strong and has no equitable main simple solution. 


Proof. The latter conclusion follows at once from Lemma 3 of § 3. 


Remark 1. The system of all lines in m-dimensional finite projective space 
PG(m, 2) over the integers modulo 2, and the system of all lines in m-dimen- 
sional finite euclidean space EG(m, 3) over the integers modulo 3, are Steiner 
triple systems. Of course, not every Steiner triple system is of this type. 


Remark 2. The conclusion of Theorem 3 does not hold for the dual of a partially balanced design with some $\lambda_i = 1$ and some $\lambda_j > 1$. For instance, the game of Example 2 is not strong but the game of Example 6 is strong (see § 10, below).







By a triangular association scheme (cf. 2) is meant an n by n matrix in which: (a) the elements on the principal diagonal are left blank; (b) the n(n − 1)/2 positions above the principal diagonal are filled by the numbers 1, 2, ..., n(n − 1)/2; (c) the matrix is symmetric. If we take the players to be the numbers 1, 2, ..., n(n − 1)/2, and the minimal winning coalitions $W_1, \dots, W_n$ to be the rows of the triangular association scheme, then it is easily seen that we have a block design game with v = n(n − 1)/2, b = n, k = n − 1, r = 2, and every two distinct minimal winning coalitions have one player in common. We shall term such a game a triangular game.


THEOREM 6. A triangular game with n > 3 has a blocking coalition with $\le k - 1$ members; hence it is not strong and has no equitable main simple solution.


Proof. Let $\{x_1\} = W_1 \cap W_2$. Since $W_1 \cup W_2$ has 2n − 3 members, and v > 2n − 3 for n > 3, there exists an $x_2 \notin W_1 \cup W_2$. Suppose, for example, that $\{x_2\} = W_3 \cap W_4$. Then we can choose arbitrarily $x_i \in W_i$ (i = 5, 6, ..., n), distinct or not. Obviously, the distinct members of the set $\{x_1, x_2, x_5, x_6, \dots, x_n\}$ form a blocking coalition with $\le n - 2 = k - 1$ members. The second assertion of the theorem follows from Lemma 3.


Remark 3. In fact, it is easy to see that there exists a blocking coalition with [(n + 1)/2] members, where [x] is the largest integer $\le x$, but this stronger result does not seem to have any interesting game-theoretic implications.
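
A small computation makes Remark 3 concrete. The Python sketch below models the players of a triangular game as the unordered pairs {i, j} of 1, ..., n (so that row $W_i$ consists of the pairs containing i) and builds a candidate coalition from a near-perfect matching. The matching construction is our assumption; the remark itself does not say how the coalition is obtained.

```python
from itertools import combinations


def triangular_game(n):
    """Players are the pairs (i, j), i < j, of 1..n; W_i = pairs containing i."""
    players = list(combinations(range(1, n + 1), 2))
    blocks = [frozenset(p for p in players if i in p) for i in range(1, n + 1)]
    return players, blocks


def is_blocking(coalition, blocks):
    """A blocking coalition meets every block but contains none of them."""
    c = set(coalition)
    return all(c & b for b in blocks) and not any(b <= c for b in blocks)


def matching_coalition(n):
    """Pairs (1,2), (3,4), ...; if n is odd, the leftover index n is paired with 1."""
    pairs = [(i, i + 1) for i in range(1, n, 2)]
    if n % 2:
        pairs.append((1, n))
    return pairs


if __name__ == "__main__":
    for n in range(4, 10):
        _, blocks = triangular_game(n)
        coalition = matching_coalition(n)
        print(n, len(coalition), is_blocking(coalition, blocks))
```

Every index 1, ..., n occurs in some chosen pair, so each row is met, while the coalition is too small to contain a row of n − 1 players.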


6. Even-dimensional finite projective games. In (13), finite projective games PG(h, $p^n$) were defined as follows: the players are the points of the finite projective space PG(h, $p^n$) of dimension h > 1 over the Galois field GF($p^n$), and the minimal winning coalitions are the (s + 1)-spaces if h = 2s + 1, and the s-spaces if h = 2s. As noted in (13), the odd-dimensional finite projective games are not strong and have no main simple solution since the s-spaces are blocking coalitions contained in the minimal winning coalitions (cf. Lemma 1, above). In (13), it is also proved that the plane games PG(2, $p^n$) are not strong except for $p^n = 2$, but that all of them have equitable main simple solutions. We shall now round out this discussion by disposing of the games PG(2s, $p^n$) with $s \ge 2$.

THEOREM 7. The games PG(2s, $p^n$), $s \ge 2$, are not strong.


Proof. Consider any (s + 1)-space $P_{s+1}$ in PG(2s, $p^n$). Since any s-space intersects $P_{s+1}$ in a space of dimension at least one, a set B will be a blocking coalition if it consists of points of $P_{s+1}$ such that B meets every line of $P_{s+1}$ but contains no s-space of $P_{s+1}$. We show that such a set B exists. Introduce a homogeneous co-ordinate system $(x_0, x_1, \dots, x_{s+1})$ into $P_{s+1}$ in the usual way by means of an (s + 1)-simplex of co-ordinates $\sigma_{s+1}$.


Case 1. Suppose either $p^n \ne 2$, or $p^n = 2$ and s is even. Let B be the set of all points x of $P_{s+1}$ such that the number Z(x) of zero co-ordinates of point x satisfies $1 \le Z(x) \le s$.










We prove first that every line l of $P_{s+1}$ intersects B. Clearly, l meets the s-space $x_0 = 0$ in at least one point x. If $Z(x) \ne s + 1$, it is the desired point of B. If Z(x) = s + 1, then let the remaining non-zero co-ordinate of x be $x_i$, $i \ne 0$. Let y be a point of intersection of l with the s-space $x_i = 0$. If $Z(y) \ne s + 1$, it is the desired point of B. If Z(y) = s + 1, then the point x + y is a point of l having Z(x + y) = s, and is therefore in B.

We must still prove that B contains no s-space of $P_{s+1}$. Let an equation of an arbitrary s-space of $P_{s+1}$ be

(8) $a_0x_0 + a_1x_1 + \dots + a_{s+1}x_{s+1} = 0.$

If at least one coefficient, say $a_0$, is equal to zero, then the point (1, 0, 0, ..., 0) is a point of the s-space not in B. If all coefficients of (8) are different from zero, we consider two cases, $p^n = 2$ or $p^n \ne 2$. If $p^n = 2$, and s is even, then s + 2 is even, and, since all $a_i = 1$ by hypothesis, we have

$$\sum_{i=0}^{s+1} a_i \equiv 0 \pmod 2;$$

hence (1, 1, ..., 1) is a point of the s-space not in B. If $p^n \ne 2$, let $c \ne 0, 1$ and consider the numbers

(9) $a_0 + a_1 + \dots + a_s$

and

(10) $a_0 + a_1 + \dots + a_{s-1} + ca_s.$

At least one of these is not zero, because if both were zero then subtraction would yield $(c - 1)a_s = 0$ and hence $a_s = 0$ contrary to the hypothesis that all $a_i \ne 0$. If (9) is not zero, then the point

$$\left(1, 1, \dots, 1, -\frac{a_0 + a_1 + \dots + a_s}{a_{s+1}}\right)$$

satisfies (8) but is not in B. If (10) is not zero, then the point

$$\left(1, \dots, 1, c, -\frac{a_0 + a_1 + \dots + a_{s-1} + ca_s}{a_{s+1}}\right)$$

satisfies (8) but is not in B.


Case 2. Suppose $p^n = 2$, and s is odd, $s \ge 3$. Let B be the set of all points x of $P_{s+1}$ with $Z(x) \ne 1, s + 1$.

We prove first that every line l of $P_{s+1}$ intersects B. Let x be a point common to l and the s-space $x_0 = 0$. If $x \ne$ (0, 1, 1, ..., 1), (0, 1, 0, ..., 0), (0, 0, 1, 0, ..., 0), ..., (0, 0, ..., 0, 1), then $x \in B$. If x is one of these points, let y be a point common to l and the s-space $y_i = 0$, where the ith co-ordinate of x is 1, so that $x \ne y$. Suppose, for example, i = 1. If $y \ne$ (1, 0, 1, ..., 1), (1, 0, 0, ..., 0), (0, 0, 1, 0, ..., 0), ..., (0, 0, ..., 0, 1), then $y \in B$. It is easily seen that in any of the remaining cases, Z(x + y) = 2, s, or 0. Hence x + y is a point of l belonging to B.










It is still necessary to prove that B contains no s-space of $P_{s+1}$. Let (8) be again an equation of an arbitrary s-space of $P_{s+1}$. If at least one coefficient, say $a_0$, is zero, then the point (1, 0, ..., 0) satisfies (8) but is not in B. If all $a_i \ne 0$, then all $a_i = 1$, and, since s is odd, the point (0, 1, 1, ..., 1) satisfies (8) but is not in B. This completes the proof.

Geometrically, in Case 1, B consists of all the points of the face-planes of dimensions s, ..., 1 of the co-ordinate simplex $\sigma_{s+1}$ excluding the vertices. In Case 2, B consists of all the points of $P_{s+1}$ excepting the vertices of $\sigma_{s+1}$ and the points of the s-face-planes not lying on face-planes of lower dimension. It is not difficult to count the number of points in B and to see that this number is greater than the number of points in an s-space. But in the next theorem we shall show that this must be true for any blocking coalition in PG(2s, $p^n$), $s \ge 2$; and hence, by Lemma 2, that there exists an equitable main simple solution.
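
For $p^n = 2$ the two constructions of B in this proof can be checked by brute force. In the Python sketch below (function and variable names are ours), points of $P_{s+1}$ over GF(2) are the non-zero 0/1 vectors of length s + 2, a line through x and y is {x, y, x + y}, and the s-spaces are the kernels of non-zero linear forms. It tests Case 1 for s = 2 and Case 2 for s = 3, and also shows that the Case 1 set fails for s = 3, which is why a separate Case 2 is needed.

```python
from itertools import combinations, product


def check_theorem7_set(s, in_B):
    """Check over GF(2) that B meets every line of P_{s+1} and contains no s-space.

    in_B(z) decides membership from the number z of zero co-ordinates.
    """
    dim = s + 2  # homogeneous co-ordinates x_0, ..., x_{s+1}
    pts = [p for p in product((0, 1), repeat=dim) if any(p)]
    B = {p for p in pts if in_B(p.count(0))}

    def add(x, y):
        return tuple((a + b) % 2 for a, b in zip(x, y))

    # Over GF(2) the line through x and y has exactly three points.
    every_line_meets_B = all(
        x in B or y in B or add(x, y) in B for x, y in combinations(pts, 2)
    )
    # Each s-space of P_{s+1} is the zero set of a non-zero linear form.
    no_s_space_inside_B = all(
        any(p not in B
            for p in pts
            if sum(a * b for a, b in zip(coef, p)) % 2 == 0)
        for coef in pts
    )
    return every_line_meets_B and no_s_space_inside_B


case1_s2 = check_theorem7_set(2, lambda z: 1 <= z <= 2)        # Case 1, s even
case2_s3 = check_theorem7_set(3, lambda z: z not in (1, 4))    # Case 2, s odd
case1_on_s3 = check_theorem7_set(3, lambda z: 1 <= z <= 3)     # Case 1 set, s odd

if __name__ == "__main__":
    print(case1_s2, case2_s3, case1_on_s3)
```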

In PG(2s, $p^n$), let $a_i$ be the number of points in an i-space, and let $a_j^i$ be the number of i-spaces containing a given j-space.


LEMMA 4. If r < s, then $a_{r-1}^{r} > 1 + a_s$.

Proof. By an easy calculation, we get

$$a_{r-1}^{r} = p^{(2s-r)n} + \dots + p^{2n} + p^n + 1 = a_{2s-r}.$$

But r < s implies 2s − r > s, or $2s - r \ge s + 1$. Hence $a_{2s-r} - a_s \ge a_{s+1} - a_s = p^{(s+1)n}$. Since $p^n > 1$, we have $a_{2s-r} - a_s > 1$.
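
The inequality of Lemma 4 can be spot-checked numerically. The Python sketch below assumes, as in the proof, that the number of r-spaces through a given (r − 1)-space of PG(2s, q) equals the number of points $a_{2s-r}$ of a (2s − r)-space, with q standing for $p^n$; the helper names are ours.

```python
def points(i, q):
    """Number of points a_i of an i-dimensional projective space over GF(q)."""
    return sum(q ** j for j in range(i + 1))


def lemma4_holds(r, s, q):
    """Check a_{2s-r} > 1 + a_s, the form of Lemma 4 used in the proof."""
    return points(2 * s - r, q) > 1 + points(s, q)


# All (r, s, q) with 1 <= r < s in a small window of prime powers q.
checked = [(r, s, q)
           for q in (2, 3, 4, 5, 7)
           for s in range(2, 6)
           for r in range(1, s)]
all_hold = all(lemma4_holds(r, s, q) for r, s, q in checked)

if __name__ == "__main__":
    print(len(checked), "cases checked; all hold:", all_hold)
```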





THEOREM 8. The games PG(2s, $p^n$), $s \ge 2$, have equitable main simple solutions.


Proof. By Lemma 2 it suffices to show that if B is any blocking coalition, then the number |B| of points in B is greater than $a_s$. Suppose, contrarywise, that $|B| \le a_s$.

If every line joining two points of B were contained in B, then B would be a t-space. If $t \ge s$, B could not be a blocking coalition since it would contain an s-space or minimal winning coalition. If t < s, then there would be an s-space in PG(2s, $p^n$) not meeting B, contrary to the assumption that B is a blocking coalition. Therefore, there exists a line l with at least two points in B and at least one point x not in B.

We now prove inductively that for each $r \le s - 1$ there exists an r-space containing x but not intersecting B. For r = 0, the point x suffices. Suppose the assertion is correct for r < s − 1. By Lemma 4, there are more than $1 + a_s$ (r + 1)-spaces containing the given r-space; since only one of them can contain l, there are more than $a_s$ (r + 1)-spaces containing the given r-space but not containing l. One of these (r + 1)-spaces does not intersect B, since at most one of them can meet a given point of B; for, if two of them contained the same point of B, then this point would be in the intersection of these two (r + 1)-spaces which is the given r-space, contradicting the induction hypothesis that this r-space does not intersect B. This completes the induction.

In particular, there exists an (s − 1)-space containing x but not intersecting B. Every s-space containing this (s − 1)-space meets B in at least one point since B is a blocking coalition. But the s-space determined by l and the given (s − 1)-space meets B in at least two points. Therefore $|B| \ge 1 + a_s$, contrary to the supposition that $|B| \le a_s$. This completes the proof.


7. Affine resolvable games. In this section, we examine certain simple 
games formed from block designs but not using all the blocks of the design 
as minimal winning coalitions. 

A balanced design is termed affine resolvable if the b blocks can be divided into r classes of n blocks each, such that:

(a) every one of the classes of n blocks contains a complete replication of the v elements;

(b) any two blocks of different classes have the same number of elements in common.

Then (cf. 1) we have b = nr, v = nk, b = v + r − 1, and $s_{ij} = |B_i \cap B_j| = k^2/v$ if $B_i$ and $B_j$ are in different classes. If we arbitrarily select one block from each class as a minimal winning coalition, we obtain a simple game, with $|\mathcal{W}^m| = r = b/n$, which we term an affine resolvable game. Not all these games formable from a given affine resolvable balanced design need have the same number of players. For example, an affine resolvable balanced design with v = 12, b = 22, r = 11, k = 6, $\lambda$ = 5, n = 2 is given (cf. 1) by the blocks


B_1  = (1, 3, 4, 5, 9, 11)      B_12 = (2, 6, 7, 8, 10, 12)
B_2  = (2, 4, 5, 6, 10, 1)      B_13 = (3, 7, 8, 9, 11, 12)
B_3  = (3, 5, 6, 7, 11, 2)      B_14 = (4, 8, 9, 10, 1, 12)
B_4  = (4, 6, 7, 8, 1, 3)       B_15 = (5, 9, 10, 11, 2, 12)
B_5  = (5, 7, 8, 9, 2, 4)       B_16 = (6, 10, 11, 1, 3, 12)
B_6  = (6, 8, 9, 10, 3, 5)      B_17 = (7, 11, 1, 2, 4, 12)
B_7  = (7, 9, 10, 11, 4, 6)     B_18 = (8, 1, 2, 3, 5, 12)
B_8  = (8, 10, 11, 1, 5, 7)     B_19 = (9, 2, 3, 4, 6, 12)
B_9  = (9, 11, 1, 2, 6, 8)      B_20 = (10, 3, 4, 5, 7, 12)
B_10 = (10, 1, 2, 3, 7, 9)      B_21 = (11, 4, 5, 6, 8, 12)
B_11 = (11, 2, 3, 4, 8, 10)     B_22 = (1, 5, 6, 7, 9, 12)
where $B_i$ and $B_{i+11}$ (i = 1, ..., 11) constitute the ith class. One affine resolvable game with eleven players has $B_i$ (i = 1, ..., 11) as minimal winning coalitions. Another affine resolvable game with twelve players has $B_j$ (j = 12, ..., 22) as minimal winning coalitions. In both cases $|B_i \cap B_j| = k^2/v = 3$ if $i \ne j$. In the first case {1, 3, 4} is a blocking coalition; in the second case {12} is a blocking coalition. Many other affine resolvable games can be formed from the same design.
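
The claims about this design can be verified mechanically. The Python sketch below assumes that the first eleven blocks are the cyclic shifts modulo 11 of the first block (written on the points 1, ..., 11) and that each block $B_{i+11}$ is the complement of $B_i$ with the point 12 adjoined; this pattern is read off from the legible entries of the table, and the helper names are ours.

```python
from itertools import combinations


def shift(x, t):
    """Cyclic shift by t on the points 1..11."""
    return (x - 1 + t) % 11 + 1


BASE = {1, 3, 4, 5, 9, 11}
first = [frozenset(shift(x, t) for x in BASE) for t in range(11)]      # B_1..B_11
second = [frozenset(set(range(1, 12)) - b) | {12} for b in first]       # B_12..B_22


def is_blocking(coalition, blocks):
    """A blocking coalition meets every block but contains none of them."""
    c = set(coalition)
    return all(c & b for b in blocks) and not any(b <= c for b in blocks)


def pair_count(x, y):
    """Number of blocks of the whole design containing both x and y."""
    return sum(1 for b in first + second if x in b and y in b)


if __name__ == "__main__":
    print(all(len(a & b) == 3 for a, b in combinations(first, 2)))
    print(all(len(a & b) == 3 for a, b in combinations(second, 2)))
    print(is_blocking({1, 3, 4}, first), is_blocking({12}, second))
```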
















Another example of an affine resolvable balanced design is an EG(2, $p^n$), where the classes of blocks are the parallel pencils of lines. Selecting one line from each parallel pencil, we have an affine resolvable game with $|B_i \cap B_j| = 1$ if $i \ne j$.

As an immediate consequence of Theorem 2 and Lemma 1, we have the 
following theorem. 


THEOREM 9. If an affine resolvable balanced design has $k^2/v > 1$, then any affine resolvable game obtained from it as above is not strong and has no main simple solution.


8. Interpretation in terms of linear graphs and network flows. Any simple game G can be represented as an even (or bipartite, or simple) graph, as follows. Let the two vertex sets be $\mathcal{W}^m = \{W_1, W_2, \dots, W_b\}$ and N = {1, 2, ..., v} and let $W_i \in \mathcal{W}^m$ and $j \in N$ be joined by an arc if and only if j is a member of $W_i$. Each vertex $W_i$ has degree $|W_i|$, the number of members of $W_i$. The many-valued mapping $\Gamma: N \to \mathcal{W}^m$, where $\Gamma(j)$ is the set of all minimal winning coalitions to which j belongs, is such that $\Gamma^{-1}W_i \cap \Gamma^{-1}W_j \ne \emptyset$ for $i \ne j$, or, in other words, $\Gamma\Gamma^{-1}W_i = \mathcal{W}^m$ for each $W_i$. If G is a block design game then the degree of every vertex $W_i$ of $\mathcal{W}^m$ is k and the degree of every vertex j of N is r. To a blocking coalition of any simple game G in this representation corresponds a subset B of N such that $\Gamma B = \mathcal{W}^m$ but $B \not\supset \Gamma^{-1}W_i$ for any $W_i \in \mathcal{W}^m$.

We can convert this graph theoretic representation into a network flow representation as follows. Join all vertices of N to an input vertex I, and all vertices of $\mathcal{W}^m$ to an output vertex U, as in Fig. 1, illustrating the dual of EG(2, 2) with v = 6, b = 4, k = 3, r = 2 (cf. Example 1, § 10). Putting capacities $c_{ij}$ on the arcs as indicated in the figure, a blocking coalition corresponds to a flow $x_{ij}$ yielding maximum output but with the restriction that the flow shall not be different from 0 at any entire set of vertices of the form $\Gamma^{-1}W_i$, $W_i \in \mathcal{W}^m$. This can, in turn, be expressed as a linear programming problem: find $x_{\alpha\beta}$ such that


$$\sum_{\alpha} x_{\alpha\beta} - \sum_{\gamma} x_{\beta\gamma} = 0$$

for each vertex $\beta \ne I, U$, and such that

(1) $\sum_{j \in \mathcal{W}^m} x_{jU} = \max = b$

subject to the constraints

(2) $0 \le x_{ij} \le c_{ij}$,

with the additional restriction that

(3) for each $j \in \mathcal{W}^m$ there exists an $i = i(j) \in \Gamma^{-1}(j)$ such that $x_{Ii} = \sum_j x_{ij} = 0$.

If such a flow exists, a blocking coalition is given by the set

$$\Bigl\{ i \in N \Bigm| \sum_j x_{ij} > 0 \Bigr\}.$$

If such a flow does not exist, the game is strong. By the methods of Goldman and Tucker (6), all extreme feasible vectors of the linear programme given by (1) and (2) can be determined, and then each $W_j \in \mathcal{W}^m$ (j = 1, 2, ..., b) can be examined to see if the additional restriction (3) is satisfied. For if any feasible vector is on a co-ordinate (m − r)-plane

$$x_{i_1} = x_{i_2} = \dots = x_{i_r} = 0$$

then so is some extreme feasible vector.


Thus the results of the preceding sections are readily interpreted in terms 
of linear graphs or network flows, as desired. 


9. Miscellaneous remarks. Unsolved problems. In Luce (9), it is proved that a necessary and sufficient condition for a simple game to be h-unstable is that there exist an (h + 1)-element winning coalition and that the intersection of all (h + 1)-element winning coalitions be empty.


THEOREM 10. A block design game is h-stable for $1 \le h < k - 1$ and h-unstable for $k - 1 \le h < v - 1$.


Proof. There exists a winning coalition with h + 1 members if $h \ge k - 1$. Clearly, for h = k − 1, the intersection of all (h + 1)-element winning coalitions is empty since r < b. As long as there remain two different elements to adjoin to the (h + 1)-element sets to obtain (h + 2)-element sets, induction shows that the intersection of all (h + 1)-element winning coalitions is empty for h < v − 1. This completes the proof.










We define an automorphism (collineation or perhaps cowineation) of a block design with incidence matrix $A = \|a_{ij}\|$ to be a permutation $\pi_R$ of the rows of A (elements or players) which carries columns of A (blocks) into columns of A. Let $\pi_C$ be the permutation of the columns induced by the permutation $\pi_R$. We shall assume that A has neither duplicated columns nor duplicated rows. Let $G_R$ be the group of all automorphisms $\pi_R$ and let $G_C$ be the group of all $\pi_C$.


LEMMA 5. If $\pi_R$ induces $\pi_C$ then in the dual design the automorphism $\pi_C$ induces $\pi_R$.


Proof. Let $\pi_R$ carry the matrix A into the matrix B. Then

$$b_{ij} = a_{\pi_R(i),\,j} = a_{i,\,\pi_C(j)}$$

for all i, j. Hence $\pi_R$ is induced by $\pi_C$.


LEMMA 6. No two different row permutations $\pi_R \ne \pi_R'$ can induce the same column permutation.


Proof. If so, then

$$a_{i,\,\pi_C(j)} = a_{i,\,\pi_C'(j)}$$

for all i, j and hence

$$a_{\pi_R(i),\,j} = a_{\pi_R'(i),\,j}$$

for all i, j contrary to hypothesis.


THEOREM 11. The automorphism groups of dual designs are isomorphic (as 
groups, even though of different degrees). 


Proof. Obviously, the product of two row permutations induces the product 
of the induced column permutations. Hence, with our restriction of non- 
duplication of rows and columns of A, the homomorphism is one-to-one, and 
onto. 


The following unsolved problems seem to be difficult: 


1. How can one determine the automorphism group of a block design 
(without examining one by one each of the permutations of the symmetric 
group on v letters to see if it is an automorphism, although this might be feasible 
within limits with a computer)? This is solved for the desarguesian finite pro- 
jective spaces and the associated euclidean spaces. Further, what can be said 
of the transitivity of this group acting on elements and on blocks? 


2. What is the minimum number of members in a blocking coalition of a block design game? This is unsolved even for finite projective planes, except for PG(2, 3).


3. Do the block design games, not covered by (13) or the corollaries of 
Theorem 2, above, possess main simple solutions? 










4. To determine all block design games given the parameters v, b, k, r such that bk = vr. That is, to determine all $v \times b$ matrices A with $a_{ij} = 0$ or 1 such that the row sums are all equal to r, the column sums are all equal to k, and the elements $s_{ij}$ of $S = A^TA$ are all positive.


5. If v = b and (hence) k = r, when do there exist permutation matrices P, Q such that $PAQ = A^T$ where A is the incidence matrix? When do there exist permutation matrices P, Q such that PAQ is a symmetric matrix? When is there a row permutation R such that RA is symmetric? What are general criteria for permutation equivalence of matrices with elements equal to 0 or 1? for general matrices?


10. Examples. We collect in this section some concrete examples illustrating 
some of the preceding theorems. Other examples of block designs yielding 
simple games can be found in (2). 





Example 1. The finite euclidean plane EG(2, 2) has v* = 4, b* = 6, k* = 2, r* = 3, $\lambda^*$ = 1 and (Fig. 2) incidence matrix

        a  b  c  d  e  f
   1 |  1  1  1  0  0  0
   2 |  1  0  0  1  1  0
   3 |  0  1  0  1  0  1
   4 |  0  0  1  0  1  1


FIG. 2.


In the dual game we have v = 6, b = 4, k = 3, r = 2 and the intersection of every pair of blocks has one element. The incidence matrix of the game is the transpose of the above. The sets {a, f}, {b, e}, {c, d} are blocking coalitions illustrating Theorem 1(a). By Lemma 3, there is no equitable main simple solution. But in this example, it is easy to give a direct proof that no main simple solution exists at all. For the linear system (6), (7) becomes, with obvious changes in notation:

$$\begin{aligned}
x_a + x_b + x_c &= 1\\
x_a + x_d + x_e &= 1\\
x_b + x_d + x_f &= 1\\
x_c + x_e + x_f &= 1\\
x_a + x_f &\ge 1\\
x_b + x_e &\ge 1\\
x_c + x_d &\ge 1.
\end{aligned}$$

This implies $x_a = x_f \ge \tfrac12$, $x_b = x_e \ge \tfrac12$, $x_c = x_d \ge \tfrac12$ so that a contradiction would result.
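
The blocking coalitions of this small game can be enumerated exhaustively. The Python sketch below takes the four blocks of the dual game to be the supports of the four equations of the linear system, namely {a, b, c}, {a, d, e}, {b, d, f}, {c, e, f} (our reading of the system), and lists the minimal blocking coalitions; the helper names are ours.

```python
from itertools import combinations

PLAYERS = "abcdef"
BLOCKS = [frozenset("abc"), frozenset("ade"), frozenset("bdf"), frozenset("cef")]


def is_blocking(coalition, blocks):
    """A blocking coalition meets every block but contains none of them."""
    c = set(coalition)
    return all(c & b for b in blocks) and not any(b <= c for b in blocks)


def minimal_blocking(players, blocks):
    """All blocking coalitions having no proper blocking subset."""
    found = []
    for size in range(1, len(players) + 1):          # smallest coalitions first
        for combo in combinations(players, size):
            s = frozenset(combo)
            if is_blocking(s, blocks) and not any(m < s for m in found):
                found.append(s)
    return found


if __name__ == "__main__":
    for coalition in minimal_blocking(PLAYERS, BLOCKS):
        print(sorted(coalition))
```

Under this reading the only minimal blocking coalitions are the three parallel classes named in the text.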













Example 2. The symmetric partially balanced block design designated as R1 in (2) has v* = b* = 6, k* = r* = 3, $\lambda_1 = 2$, $\lambda_2 = 1$, $n_1 = 1$, $n_2 = 4$ and incidence matrix


   4 |  1  0  1  1  0  0
   5 |  0  1  0  1  1  0
   6 |  0  0  1  0  1  1


The dual has v = b = 6, k = r = 3, every pair of blocks has an intersection of one or two elements, and the incidence matrix is the transpose of the above. The set {a, b, c} is a blocking coalition. This illustrates Theorem 1.


Example 3. The symmetric regular group divisible partially balanced block design designated as R2 in (2) has v* = b* = 6, r* = k* = 4, $\lambda_1 = 3$, $\lambda_2 = 2$, $n_1 = 2$, $n_2 = 3$ and incidence matrix

        a  b  c  d  e  f
   1 |  1  0  0  1  1  1
   2 |  1  1  1  1  0  0
   3 |  0  1  0  1  1  1
   4 |  1  1  1  0  1  0
   5 |  0  0  1  1  1  1
   6 |  1  1  1  0  0  1


Here $s_{ij}$ = 2 or 3 and indeed {1, 2} is a blocking coalition contained in the minimal winning coalition a in accordance with Corollary (d) of Theorem 2. The dual has v = b = 6, k = r = 4, every pair of blocks has an intersection of 2 or 3 elements, and its incidence matrix is the transpose of the above. In accordance with Theorem 2, there is a blocking coalition contained (properly) in a minimal winning coalition, namely, {a, b, c}, which is a proper subset of blocks 2 or 4, or {a, d} which is a proper subset of blocks 1 or 2. In fact the dual is isomorphic to the original design as can be seen by performing the permutations

$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6\\ 1 & 5 & 2 & 6 & 3 & 4 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} a & b & c & d & e & f\\ a & c & e & d & f & b \end{pmatrix}$$

on the rows and columns respectively.
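
The self-duality claimed here can be confirmed directly. The Python sketch below makes two assumptions of ours: row 3 of the incidence matrix is taken to be 0 1 0 1 1 1 (the row forced by column sums equal to k* = 4), and the two permutations are read in two-row notation as 1→1, 2→5, 3→2, 4→6, 5→3, 6→4 on rows and a→a, b→c, c→e, d→d, e→f, f→b on columns. Under these readings the permuted matrix equals the transpose.

```python
COLS = "abcdef"

A = [
    [1, 0, 0, 1, 1, 1],  # row 1
    [1, 1, 1, 1, 0, 0],  # row 2
    [0, 1, 0, 1, 1, 1],  # row 3 (reconstructed from the column sums)
    [1, 1, 1, 0, 1, 0],  # row 4
    [0, 0, 1, 1, 1, 1],  # row 5
    [1, 1, 1, 0, 0, 1],  # row 6
]

ROW_PERM = {1: 1, 2: 5, 3: 2, 4: 6, 5: 3, 6: 4}
COL_PERM = dict(zip("abcdef", "acedfb"))


def permuted(matrix):
    """Move entry (i, c) to position (ROW_PERM[i], COL_PERM[c])."""
    out = [[0] * 6 for _ in range(6)]
    for i in range(6):
        for j, c in enumerate(COLS):
            out[ROW_PERM[i + 1] - 1][COLS.index(COL_PERM[c])] = matrix[i][j]
    return out


transpose = [list(col) for col in zip(*A)]
is_self_dual = permuted(A) == transpose

if __name__ == "__main__":
    print("permuted matrix equals transpose:", is_self_dual)
```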


Example 4. Let EG(2, 3) be the euclidean plane over the integers modulo 3, with v* = 9, b* = 12, k* = 3, r* = 4, $\lambda^*$ = 1. The projective plane PG(2, 3) has the cyclic representation

   0   1   2   3   4   5   6   7   8   9  10  11  12
   1   2   3   4   5   6   7   8   9  10  11  12   0
   3   4   5   6   7   8   9  10  11  12   0   1   2
   9  10  11  12   0   1   2   3   4   5   6   7   8


Taking the line {12, 0, 2, 8} as line at infinity, and deleting its points, we derive the incidence matrix of EG(2, 3):

         a  b  c  d  e  f  g  h  i  j  k  l
    1 |  1  1  0  0  0  1  0  0  0  0  0  1
    3 |  1  0  1  1  0  0  0  1  0  0  0  0
    4 |  0  1  0  1  1  0  0  0  1  0  0  0
    5 |  0  0  1  0  1  1  0  0  0  1  0  0
    6 |  0  0  0  1  0  1  1  0  0  0  1  0
    7 |  0  0  0  0  1  0  1  1  0  0  0  1
    9 |  1  0  0  0  0  0  1  0  1  1  0  0
   10 |  0  1  0  0  0  0  0  1  0  1  1  0
   11 |  0  0  1  0  0  0  0  0  1  0  1  1


In the dual, we have v = 12, b = 9, k = 4, r = 3, every pair of blocks intersects in one element, and the incidence matrix is the transpose of the above. Note that EG(2, 3) is also the simplest non-trivial Steiner triple system. The process of Theorem 5 applied to {a, b, f, l}, choosing f = [1, 5, 6], l = [1, 7, 11], yields e = [4, 5, 7] and k = [6, 10, 11] as substitutes, producing the blocking coalition {a, b, e, k}. But this is not minimal since {a, e, k} is also a blocking coalition.
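
This example can be reproduced computationally. The Python sketch below assumes that the cyclic representation of PG(2, 3) is generated by the difference set {0, 1, 3, 9} modulo 13 and that the twelve remaining lines are labelled a, ..., l in the order of their first entries 0, 1, ..., 11; both assumptions agree with the line at infinity {12, 0, 2, 8} and with the four triples quoted in the text. Helper names are ours.

```python
DIFFERENCE_SET = (0, 1, 3, 9)
LABELS = "abcdefghijkl"

# Lines of PG(2, 3): the 13 translates of the difference set modulo 13.
pg_lines = [frozenset((i + d) % 13 for d in DIFFERENCE_SET) for i in range(13)]
infinity = pg_lines[12]                       # the line {12, 0, 2, 8}

# Lines of EG(2, 3): delete the points at infinity from the other 12 lines.
eg_lines = {LABELS[i]: pg_lines[i] - infinity for i in range(12)}
points = sorted(set(range(13)) - infinity)

# Dual game: players are the lines, one block per point.
blocks = [frozenset(l for l in LABELS if p in eg_lines[l]) for p in points]


def is_blocking(coalition, blocks):
    """A blocking coalition meets every block but contains none of them."""
    c = set(coalition)
    return all(c & b for b in blocks) and not any(b <= c for b in blocks)


if __name__ == "__main__":
    print(sorted(infinity), points)
    print(is_blocking("abek", blocks), is_blocking("aek", blocks))
```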


Example 5. One of the Steiner triple systems with v* = 13 (cf. 8) has the incidence matrix

         a  b  c  d  e  f  g  h  i
    1 |  1  1  1  1  1  1  0  0  0
    2 |  1  0  0  0  0  0  1  1  1
    3 |  1  0  0  0  0  0  0  0  0
    4 |  0  1  0  0  0  0  1  0  0
    5 |  0  1  0  0  0  0  0  1  0
    6 |  0  0  1  0  0  0  1  0  0
    7 |  0  0  1  0  0  0  0  1  0
    8 |  0  0  0  1  0  0  0  0  1
    9 |  0  0  0  1  0  0  0  0  0
   10 |  0  0  0  0  1  0  0  0  1
   11 |  0  0  0  0  1  0  0  0  0
   12 |  0  0  0  0  0  1  0  0  0
   13 |  0  0  0  0  0  1  0  0  0

         j  k  l  m  n  o  p  q  r  s  t  u  v  w  x  y  z
    1 |  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
    2 |  1  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

[Columns j to z of rows 3 to 13 are illegible in the source.]


















The dual has v = 26, 6 = 13, k = 6, r = 3, every pair of blocks intersects in 
one element, the incidence matrix is the transpose of the above, and {a, m, s, w, 


f\ is a blocking coalition. 


Example 6. The design T9 of (2) has v* = 10, b* = 6, k* = 5, r* = 3, $\lambda_1 = 1$, $\lambda_2 = 2$, $n_1 = 6$, $n_2 = 3$, and incidence matrix

         a  b  c  d  e  f
    1 |  1  1  0  0  1  0
    2 |  0  0  1  0  1  1
    3 |  1  0  0  1  0  1
    4 |  0  1  1  1  0  0
    5 |  0  1  0  1  0  1
    6 |  0  0  1  1  1  0
    7 |  1  0  1  0  0  1
    8 |  1  1  1  0  0  0
    9 |  1  0  0  1  1  0
   10 |  0  1  0  0  1  1


The dual has v = 6, b = 10, k = 3, r = 5, and the incidence matrix is the transpose of the above. This game is strong, and has a main simple vector

$a_i = 1/3$ (i = 1, 2, ..., 6).
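
That this game is strong can be confirmed by exhaustive search. In the Python sketch below the ten blocks of the dual are the rows of the incidence matrix read as triples of the players a, ..., f; rows 6 and 10 are taken as 0 0 1 1 1 0 and 0 1 0 0 1 1, the readings consistent with row sums 3 and column sums 5 (an assumption of ours where the source is smudged). A game is strong when it has no blocking coalition.

```python
from itertools import combinations

PLAYERS = "abcdef"
BLOCKS = [frozenset(t) for t in
          ("abe", "cef", "adf", "bcd", "bdf", "cde", "acf", "abc", "ade", "bef")]


def is_blocking(coalition, blocks):
    """A blocking coalition meets every block but contains none of them."""
    c = set(coalition)
    return all(c & b for b in blocks) and not any(b <= c for b in blocks)


# Every non-empty proper coalition is tested.
blocking = [combo
            for size in range(1, len(PLAYERS))
            for combo in combinations(PLAYERS, size)
            if is_blocking(combo, BLOCKS)]
is_strong = not blocking

if __name__ == "__main__":
    print("blocking coalitions:", blocking)
    print("strong:", is_strong)
```

The search comes out empty because exactly one of each complementary pair of triples is a block.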


Example 7. The design SR14 of (2) is a symmetric semi-regular group divisible design with v = b = 9, k = r = 6, $\lambda_1 = 3$, $\lambda_2 = 4$, and incidence matrix:

[9 × 9 incidence matrix with rows 1 to 9 and columns a to i; entries illegible in the source.]

Here all $s_{ij} \ge \lambda_1 = 3$, and indeed {1, 2, 3} is a blocking coalition contained in the minimal winning coalition a, as promised by Corollary (f) of Theorem 2.













REFERENCES 


1. R. C. Bose, A note on the resolvability of balanced incomplete block designs, Sankhyā, 6 (1942), 105-110.

2. R. C. Bose, W. H. Clatworthy, and S. S. Shrikhande, Tables of partially balanced designs with two associate classes, Tech. Bull. No. 107, North Carolina Agricultural Experiment Station (1954).

3. R. C. Bose and K. R. Nair, Partially balanced incomplete block designs, Sankhyā, 4 (1939), 337-372.

4. R. D. Carmichael, Introduction to the theory of groups of finite order (Ginn, 1937).

5. W. S. Connor, Some relations among the blocks of symmetrical group divisible designs, Ann. Math. Stat., 23 (1952), 602-609.

6. A. J. Goldman and A. W. Tucker, Theory of linear programming. In H. W. Kuhn and A. W. Tucker, Linear inequalities and related systems (Princeton, 1956).

7. H. M. Gurk and J. R. Isbell, Simple solutions. In A. W. Tucker and R. D. Luce, Contributions to the theory of games IV (Princeton, 1959).

8. M. Hall, Jr., A survey of combinatorial analysis. In I. Kaplansky, E. Hewitt, M. Hall, Jr., and R. Fortet, Some aspects of analysis and probability (Wiley, 1958).

9. R. D. Luce, A definition of k-stability for n-person games, Ann. Math., 59 (1954), 357-366.

10. E. H. Moore, Tactical memoranda, Amer. J. Math., 18 (1896), 264-303.

11. E. Netto, Lehrbuch der Combinatorik (Chelsea reprint of 1927 edition).

12. J. von Neumann and O. Morgenstern, Theory of games and economic behavior (2nd ed.; Princeton, 1947).

13. M. Richardson, On finite projective games, Proc. Amer. Math. Soc., 7 (1956), 458-465.

14. L. S. Shapley, Lectures on n-person games (Princeton University notes, unpublished).

15. S. S. Shrikhande, On the dual of some balanced incomplete block designs, Biometrics, 8 (1952), 66-72.


General Electric Company 
and 
Brooklyn College and Princeton University 







SIMPLE ALGEBRAS OF TYPE (1, 1) ARE ASSOCIATIVE 
ERWIN KLEINFELD 


1. Introduction. In the classification of almost alternative algebras relative to quasiequivalence by Albert, two new classes of algebras of type $(\gamma, \delta)$ were introduced, namely those of type (1, 1) and (−1, 0) (1, equations (34), (35), and Theorem 6). Since rings of type (1, 1) and (−1, 0) are anti-isomorphic it suffices to consider those of type (1, 1). They may be defined as rings satisfying

(1) $B(x, y, z) = (x, y, z) - (x, z, y) = 0,$

and

(2) $A(x, y, z) = (x, y, z) + (y, z, x) + (z, x, y) = 0,$

for all elements x, y, and z of the ring, where the associator (a, b, c) is given by (a, b, c) = (ab)c − a(bc).

Actually the identity 
(2’) (x, x,x) = 0, 


together with (1) implies (2) whenever the characteristic of the ring is dif- 
ferent from 2. This may readily be verified by linearizing (2’). Consequently 
we may use (1) and (2’) as the defining relations for a ring of type (1, 1). 

Additional results on rings of type (1, 1) were obtained by Kokoris (3; 4) 
and the author (2). The main result of the present paper, which is stated in 
the title, draws upon these results. Let R be a ring of type (1, 1), u any element 
of R of the form u = (x, y, x), and C the right ideal of R generated by u. 
Then uC = 0 = Cu (Theorem 2). This turns out to be the key result in the 
structure theory for it assures the existence of an abundance of right ideals 
even under the assumption of simplicity (Theorems 6 and 8). In contrast to 
this, if R is also not associative then it has no proper left ideals (Theorem 4). 
Every minimal right ideal A of R has the property A? = 0 (Theorem 5). 
With the additional hypothesis of chain conditions on right ideals no maximal 
right ideal of R can be nil and the union of the minimal right ideals of R is 
contained in the intersection of the maximal right ideals (Theorem 8). By 
assuming either that R has an idempotent or that R is a finite dimensional 
algebra one can utilize the information about the right ideals of R in order to 
reach a contradiction. In fact even primitive rings and hence semi-simple 
rings of type (1, 1) turn out to be associative [Theorem 11 and its Corollaries]. 
The characteristic of R is assumed to be different from 2 and 3, and in §4 
different from 5 as well. 


Received June 9, 1959. The research for this paper was supported in part by a grant from 
the Office of Ordnance Research to Ohio State University. 




We also consider the more general question of rings of type $(\gamma, \delta)$. When $\gamma \ne 1, -1$ it turns out that a simple ring that is not associative has no proper left or right ideals, and therefore the techniques developed for rings of type (1, 1) are not applicable.


2. Identities. Fundamental to all our results on rings of type (1, 1) is 
Theorem 2, already mentioned in the Introduction, which permits the con- 
struction of right ideals. This result must be obtained through complicated 
computation. It is a more sophisticated version of the identity (x, y, x)? = 0, 
which constituted the main result of (2). We shall save considerable time and 
effort by recalling the following identities that hold for all elements w, x, y, z 
of a ring of type (1, 1). Except for (10), which is a specialization of (1) and 
(2), these identities may easily be located in (2). The commutator (x, y) is 
defined by (x, y) = xy — yx. 


(3) $(x, (x, y, z)) = 0,$

(4) $C(w, x, y, z) = (w, (x, y, z)) + (x, (w, y, z)) = 0,$

(5) $D(x, y, z) = (x, yz) - y(x, z) - (x, y)z + (x, y, z) = 0,$

(6) $F(w, x, y, z) = (wx, y, z) - (w, xy, z) + (w, x, yz) - w(x, y, z) - (w, x, y)z = 0,$

(7) $H(w, x, y, z) = (w, (w, x, y)z) - (w, (w, x, z)y) = 0,$


(8) ((x, y, s),x,x) = 0, 
(9) ((x, y, x), y, x) = ((x, y, x), y) = ((x, y, x), yx) = 0, 
(10) —(y, x, x) = 2(x, y, x) = 2(x, x, y). 
In addition to these identities we shall also make use of a result of Kokoris 
(4), that every subring of a ring of type (1, 1) that is generated by a single 
element is associative. 
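
Identity (10) is described above as a specialization of (1) and (2). A short sketch of that specialization follows. It assumes, as the way the two identities are used throughout this section suggests, that (1) reads (x, y, z) = (x, z, y) and that (2) is the cyclic sum (x, y, z) + (y, z, x) + (z, x, y) = 0; neither is restated here.

```latex
% Assumed forms of the defining identities (not restated in this section):
%   (1)  (x,y,z) = (x,z,y)
%   (2)  (x,y,z) + (y,z,x) + (z,x,y) = 0
\begin{align*}
0 &= (x,x,y) + (x,y,x) + (y,x,x) && \text{(2) applied to the triple } x,\ x,\ y \\
  &= 2(x,y,x) + (y,x,x)          && \text{since } (x,x,y) = (x,y,x) \text{ by (1)},
\end{align*}
so that $-(y,x,x) = 2(x,y,x) = 2(x,x,y)$, which is (10).
```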

Our first objective is to establish the following generalization of (3). 


THEOREM 1. Let v be any element of the form v = (w, x, y), where w, x, y are
elements of a ring R of type (1, 1). Then the right ideal D generated by v has the
property that (w, D) = 0.


Proof. Let s = (y, x, x) and u = (x, y, x). Then −s = 2u, as a result of
(10). Since (u, yx) = 0 is implied by (9), we know that (s, yx) = 0. Also

0 = C(y, yx, x, x) = (y, (yx, x, x)) + (yx, s) = (y, (yx, x, x)).

But

0 = F(y, x, x, x)
  = (yx, x, x) − (y, x², x) + (y, x, x²) − y(x, x, x) − (y, x, x)x
  = (yx, x, x) − (y, x, x)x,

as a result of (1). Since (y, (yx, x, x)) = 0, we must also have (y, (y, x, x)x) = 0.
Replacing x by x + z in this last identity and utilizing (1) and (7), it must
follow that 3(y, (y, x, x)z) + 3(y, (y, z, z)x) = 0. Now replacing x by −x in
this last identity and adding one obtains 6(y, (y, x, x)z) = 0. Because of our
assumption on the characteristic of R we may divide by 6, so that
(y, (y, x, x)z) = 0. At this point replace x by x + w. Then utilization of (1)
results in (y, (y, w, x)z) = 0. In summary, we have shown that

(11) (w, (w, x, y)p) = 0.
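
The linearization step in the proof above is compressed. The following sketch spells it out under the assumption that (1) reads (a, b, c) = (a, c, b); identity (7) is used exactly as quoted in Section 2.

```latex
% Linearizing (y,(y,x,x)x) = 0 by the substitution x -> x + z.
\begin{align*}
(y,x+z,x+z) &= (y,x,x) + 2(y,x,z) + (y,z,z) && \text{by (1)},\\
0 &= \bigl(y,\,(y,x+z,x+z)(x+z)\bigr) \\
  &= (y,(y,x,x)z) + 2(y,(y,x,z)x) + 2(y,(y,x,z)z) + (y,(y,z,z)x) \\
  &= 3(y,(y,x,x)z) + 3(y,(y,z,z)x).
\end{align*}
% Second line: the terms (y,(y,x,x)x) and (y,(y,z,z)z) vanish by the identity itself.
% Third line: (7) gives (y,(y,x,z)x) = (y,(y,x,x)z), and (1) followed by (7) gives
%             (y,(y,x,z)z) = (y,(y,z,x)z) = (y,(y,z,z)x).
% Replacing x by -x changes the sign of the second summand only; adding the two
% relations gives
\[
6\,(y,(y,x,x)z) = 0 .
\]
```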
From here on we consider elements of the form

(w, x, y)R_{k_1}R_{k_2} ... R_{k_n},

where R_k is the mapping a → ak. Our inductive assumption will be that w
commutes with all such elements for a given n and we shall attempt to prove
this for n + 1. Incidentally (11) suffices to start off the induction. In case
n = 2 we merely leave off

T = R_{k_3}R_{k_4} ... R_{k_{n+1}}.

Consider therefore t = (w, [(w, x, y)p·q]T). In attempting to show that
t = 0, the first step consists of establishing that the value of t is unchanged
under all permutations on x, y, p, q. That the interchange of x and y does not
alter the value of t is a consequence of (1). Starting with
0 = F(w, x, y, p) = (wx, y, p) − (w, xy, p) + (w, x, yp) − w(x, y, p)
    − (w, x, y)p,

we apply the mapping R_q T to this equation and commute the result with w.
Because of the induction hypothesis we have

(w, (w, xy, p)R_q T) = 0 = (w, (w, x, yp)R_q T).

Therefore

(w, (wx, y, p)R_q T) − (w, [w(x, y, p)]R_q T) − (w, [(w, x, y)p·q]T) = 0.

In the first two terms of this last identity one may interchange y and p
without changing their values, hence this must be true of the third term. But
that term is −t. Finally the induction hypothesis implies
(w, [(w, x, y)·pq]T) = 0, so that

t = (w, [(w, x, y)p·q]T) = (w, ((w, x, y), p, q)T).

But in the last term p and q may be permuted without change in value, so
that this must also be true of t. This suffices to demonstrate that every
permutation of x, y, p, q leaves t unchanged. Suppose now that


t′ = (w, [(w, x, x)x·x]T).

Because of (8) and (10), t′ = (w, [(w, x, x)·x²]T), and the latter is zero as a
result of the induction hypothesis. Therefore t′ = 0. In t′ replace x by x + p
and also by x − p and add the two expressions. Now, utilizing the fact that
every permutation of x, y, p, q in t does not alter the value of t, we see that
12(w, [(w, x, x)p·p]T) = 0, so that (w, [(w, x, x)p·p]T) = 0. By replacing x
by x + y in the last identity and then replacing p by p + q it becomes clear
that (w, [(w, x, y)p·q]T) = 0. This of course completes the induction.
However, (w, D) consists of sums of elements that are of the same type as t, but
where n is arbitrary. Consequently (w, D) = 0. This completes the proof of
the theorem.
At this point we are ready to prove the basic 


THEOREM 2. Let u be any element of the form u = (x,y, x), where x and y 
are elements of a ring R of type (1,1). Then the right ideal C generated by u has 
the property that uC = 0 = Cu. 


Proof. Let c be an arbitrary element of C. Then Theorem 1 implies that
(c, x) = 0. Because of (10), (y, x, x) = −2(x, y, x) = −2u, so that the right
ideal generated by (y, x, x) must also be C. A second application of Theorem 1
yields that (c, y) = 0. Suppose that

T = R_{k_1}R_{k_2} ... R_{k_n}.

Then as a consequence of Theorem 1 it follows that ((r, a, b)T, r) = 0. If we
replace r by r + s in this last identity it becomes clear that

(12) ((s, a, b)T, r) = −((r, a, b)T, s).

Suppose then that we set r = yx, s = y, and a = b = x in (12). Then we
obtain ((y, x, x)T, yx) = −((yx, x, x)T, y). However, it follows from

0 = F(y, x, x, x)
  = (yx, x, x) − (y, x², x) + (y, x, x²) − y(x, x, x) − (y, x, x)x
  = (yx, x, x) − (y, x, x)x,

that [(y, x, x)x]T = (yx, x, x)T. Since [(y, x, x)x]T is an element of C and
(C, y) = 0, it must be that ((yx, x, x)T, y) = 0. But then it follows from
above that ((y, x, x)T, yx) = 0. In other words (c, yx) = 0, because every
element of C may be written as a sum of elements of the form (y, x, x)T. We
have already established that (c, y) = (c, x) = 0, so that

0 = D(c, y, x) = (c, yx) − y(c, x) − (c, y)x + (c, y, x) = (c, y, x).

But then (c, x, y) = 0, as a result of (1). Expanding

0 = D(c, x, y) = (c, xy) − x(c, y) − (c, x)y + (c, x, y) = (c, xy),

it also follows that (c, xy) = 0. In summary, we have shown that

(13) (c, xy) = (c, yx) = (c, x) = (c, y) = (c, x, y) = (c, y, x) = 0.
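
The two expansions of D used above rest on identity (5). As a check on its signs, the sketch below expands D directly from the definitions of the commutator and the associator. No property of the ring is used until the last step, which assumes that (1) reads (a, b, c) = (a, c, b).

```latex
% (a,b) = ab - ba and (a,b,c) = (ab)c - a(bc).
\begin{align*}
(x,yz) - y(x,z) - (x,y)z
  &= x(yz) - (yz)x - y(xz) + y(zx) - (xy)z + (yx)z \\
  &= -\bigl[(xy)z - x(yz)\bigr] + \bigl[(yx)z - y(xz)\bigr] - \bigl[(yz)x - y(zx)\bigr] \\
  &= -(x,y,z) + (y,x,z) - (y,z,x),
\end{align*}
and therefore
\[
D(x,y,z) = (y,x,z) - (y,z,x) = 0 \quad \text{by (1)}.
\]
```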


Also setting r = u, s = y, and a = b = x in (12) we find that
((y, x, x)T, u) = −((u, x, x)T, y). But (u, x, x) = 0, as a result of (8), so
that ((y, x, x)T, u) = 0. From this one may conclude as before that

(14) (c, u) = 0.

Since u = (xy)x − x(yx) it follows from (14) that −(c, xy·x) + (c, x·yx) = 0.
But then

0 = −D(c, xy, x) + D(c, x, yx)
  = −(c, xy·x) + xy·(c, x) + (c, xy)x − (c, xy, x)
    + (c, x·yx) − x(c, yx) − (c, x)·yx + (c, x, yx)
  = (c, x, yx) − (c, xy, x),

as a result of (13). But then the last identity may be used in

0 = F(c, x, y, x) = (cx, y, x) − (c, xy, x) + (c, x, yx) − c(x, y, x) − (c, x, y)x,

together with (13) and the observation that cx is an element of C, to establish
−c(x, y, x) = 0. This implies that cu = 0. But then as a result of (14) we also
must have uc = 0. This argument holds for every element c of C, so that
uC = 0 = Cu. This completes the proof of the theorem.


3. The structure of left and right ideals. In this section R will be
assumed to be a simple ring of type (1, 1), of characteristic different from 2 and
3, that is not associative. In this connection simple means that the only
two-sided ideals of R are R and 0. This hypothesis on R may of course lead to
a contradiction, in which case we would be justified in concluding that simple 
rings of type (1, 1) are associative. Indeed we obtain this result in §4 with the 
added assumption that R is a finite dimensional algebra. In the present section 
we shall adhere to the more general situation. 


THEOREM 3. Rings of type (1, 1) that have no proper right ideals are associative. 


Proof. Form u = (x, y, x), for arbitrary elements x and y of R. Let C be the 
right ideal generated by u. Then either C = 0, in which case u = 0, or C = R. 
In the latter case we may make use of Theorem 2 in order to obtain that 
uR = 0 = Ru. The set of all elements q of R with the property that qR = 0
= Rq may be verified to form a two-sided ideal of R. Since R is simple, either
all such q are zero, or R² = 0. In the latter instance R would be associative,
contrary to assumption. Therefore in all cases u = (x, y, x) = 0. But then
(1) and (2) may be employed to establish that (y, x, x) = 0. Replacing x
by x + z in this last identity we are forced to conclude that R is associative,
a contradiction. The contradiction is the result of the assumption that R
was not associative. This concludes the proof of the theorem.
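
The last step of the proof, passing from (y, x, x) = 0 to associativity, can be sketched as follows. The sketch assumes that (1) reads (a, b, c) = (a, c, b) and uses the standing hypothesis that the characteristic is different from 2.

```latex
% (y,x,x) = 0 is taken for all x, y in R.
\begin{align*}
0 &= (y, x+z, x+z) \\
  &= (y,x,x) + (y,x,z) + (y,z,x) + (y,z,z) \\
  &= (y,x,z) + (y,z,x) = 2\,(y,x,z),
\end{align*}
% so (y,x,z) = 0 for all x, y, z when the characteristic is different from 2.
```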

The situation on left ideals is just the reverse. 


THEOREM 4. Simple rings of type (1, 1) that are not associative have no proper 
left ideals. 
















Proof. Let B be a proper left ideal of R. An element s of B will be defined to
be special (relative to B) in case sR is always contained in B. It is easy to
verify that the set S of special elements is closed under subtraction. It turns
out we can even show that the special elements form a two-sided ideal of R.
Select arbitrary elements x, y, z in R, a, b in B, and s in S. Then (x, y, b)
= (xy)b − x(yb) is an element of the left ideal B. Then because of (1), (x, b, y)
must also be in B. But then (b, x, y) is also in B, as a result of (2). Since s
is in B and B is a left ideal of R it follows that xs is in B. On the other hand
(xs)y = (x, s, y) + x(sy). But it follows from the definition of S that sy must
be in B, so that x(sy) is also in B. Since s is in B it follows from previous
discussion that (x, s, y) is in B. As a result both xs and (xs)y are in B, hence
xs is special. Then we know that S is a left ideal of R. In a similar manner sx
is in B because of the definition of S, while (sx)y = (s, x, y) + s(xy) is also
in B. Then S is a two-sided ideal of R. Since S is contained in B and B is a
proper left ideal of R, we must have S ≠ R. But R is simple, so that S = 0.
This is very useful information since we can show that various elements of R
must be special. Finally we shall be able to deduce that B = 0, which is the
desired contradiction. To begin with (a, b, x) is clearly in B. Also


0 = F(a, b, x, y) = (ab, x, y) − (a, bx, y) + (a, b, xy) − a(b, x, y) − (a, b, x)y,

so that −(a, b, x)y is also in B. Hence (a, b, x) is special and consequently
(a, b, x) = 0. In other words (B, B, R) = 0. Then from (1) and (2) it follows
that also (B, R, B) = 0 and (R, B, B) = 0. For this reason

0 = F(x, y, b, b)
  = (xy, b, b) − (x, yb, b) + (x, y, b²) − x(y, b, b) − (x, y, b)b
  = (x, y, b²) − (x, y, b)b,

so that (x, y, b²) = (x, y, b)b. On the other hand

0 = F(x, b, b, y) + F(x, b, y, b)
  = (xb, b, y) − (x, b², y) + (x, b, by) − x(b, b, y) − (x, b, b)y
    + (xb, y, b) − (x, by, b) + (x, b, yb) − x(b, y, b) − (x, b, y)b
  = −(x, b², y) + (x, b, by) − (x, by, b) − (x, b, y)b.

But (x, b, by) − (x, by, b) = 0 because of (1), while (x, b², y) = (x, y, b²), and
(x, b, y)b = (x, y, b)b, for the same reason. But then (x, y, b²) = −(x, y, b)b,
whereas we have already noted that (x, y, b²) = (x, y, b)b. Therefore (x, y, b²)
= (x, y, b)b = 0. The nucleus N of R is defined as the set of all elements n
in R such that (n, R, R) = (R, n, R) = (R, R, n) = 0. Because of (1) and
(2), b² must be in N. Let n be an arbitrary element of N. Then


0 = C(n, x, y, z) = (n, (x, y, z)) + (x, (n, y, z)) = (n, (x, y, z)),

so that (n, (R, R, R)) = 0. Similarly (12) implies that (n, (R, R, R)R) = 0.
However, the set of finite sums of elements that are of the form (R, R, R)
and of the form (R, R, R)R form a two-sided ideal in an arbitrary ring R.
If that ring is also simple and not associative then of course it follows that
this ideal must be the whole ring. The conclusion we can draw in the present
situation is that (n, R) = 0, so that (b², R) = 0. Since b² is an element of B
and now also b²x = xb² is an element of B we conclude that b² must be special,
hence b² = 0. If we replace b by a + b then also ab + ba = 0. Since (x, y, b)
is an element of B it must then follow that (x, y, b)a = −a(x, y, b). Also


0 = C(a, x, y, b)
  = (a, (x, y, b)) + (x, (a, y, b)) = (a, (x, y, b))
  = a(x, y, b) − (x, y, b)a.

But then a(x, y, b) = (x, y, b)a = 0. This may be used in the expansion of

0 = F(x, y, b, a) = (xy, b, a) − (x, yb, a) + (x, y, ba) − x(y, b, a) − (x, y, b)a,

to show that (x, y, ba) = 0. As before this implies ba is in the nucleus, and
hence commutes with every element of R. Consequently one can show that
ba is special and so ba = 0. In other words B² = 0. Form J = B + BR. Then
(bx)y = (b, x, y) + b(xy). We have already noted that (b, x, y) is an element
of B, and hence of J. Then J must be a right ideal of R. Similarly y(bx) =
−(y, b, x) + (yb)x, where −(y, b, x) is an element of B and (yb)x is an
element of BR. Thus y(bx) is an element of J. This suffices to show that J
is a two-sided ideal of R. If J = 0, then B = 0, contrary to assumption. If
on the other hand J = R, then BJ = B(B + BR) = 0, so that BR = 0. But
then J = B, and so R = B, contrary to assumption. In either case we have
reached a contradiction. Consequently there can exist no proper left ideal B
in R. This completes the proof of the theorem.
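
Two short steps in the middle of this proof are stated without computation: the anticommutativity of B once all squares in B vanish, and the conclusion a(x, y, b) = 0. A sketch of both, using only facts established in the proof and the assumption that the characteristic is different from 2:

```latex
% a, b range over the left ideal B, in which every square has been shown to vanish.
\[
0 = (a+b)^2 = a^2 + ab + ba + b^2 = ab + ba ,
\]
% so any two elements of B anticommute. Since (x,y,b) lies in B,
\[
(x,y,b)\,a = -\,a\,(x,y,b),
\]
% while the expansion of C(a,x,y,b) = 0 gives
\[
a\,(x,y,b) = (x,y,b)\,a .
\]
% Adding the last two relations yields 2a(x,y,b) = 0, hence
% a(x,y,b) = (x,y,b)a = 0 when the characteristic is not 2.
```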

If Theorem 4 were true for right ideals also, then of course this would prove 
simple rings of type (1, 1) to be associative, which is the strongest possible 
result one could hope to get. Arguments of the type used in the proof of 
Theorem 4 seem inadequate for this purpose. With some effort one can obtain 
a construction for a right ideal of R, properly contained within any non-zero 
right ideal A of R. Except when A is minimal, this construction does not seem 
to be especially enlightening, and since we can get some information on 
minimal right ideals more directly, we shall not go into this construction. 


THEOREM 5. If R is a simple ring of type (1, 1) that is not associative, and if
A is a minimal right ideal of R, then A² = 0.


Proof. Let t = (a, x, x), where a is an arbitrary element of A and x an
arbitrary element of R. Let C denote the right ideal generated by t. Since t
is an element of A, C must also be contained in A. From the minimality of A
as a right ideal it follows that either C = 0, or that C = A. If C = 0 then
t = 0. On the other hand Theorem 2 implies that Ct = 0 = tC, so that if
C = A, then At = 0 = tA. In either case we may conclude that At = 0 = tA.
Replacing x by x + y in this last identity and using (1) we conclude that
A(a, x, y) = 0 = (a, x, y)A, where y is an arbitrary element of R. In other
words A(A, R, R) = 0 = (A, R, R)A. Let P be the set of all elements p in
A with the property that Ap = A(pR) = 0. P is obviously closed under
subtraction. With the last identity we shall be able to show that P is in fact
a right ideal of R. Select arbitrary elements p in P, a, b, d in A, and x, y, z
in R. Since p is in A, so is px. From the definition of P it follows that a(px) =
0. Also a(px·y) = a(p, x, y) + a(p·xy) = a(p, x, y). Since 0 = A(A, R, R),
we have a(p, x, y) = 0, and thereby a(px·y) = 0. Consequently P is a right
ideal of R and contained in A. Let us assume that A² ≠ 0. Then P ≠ A.
From the minimality of A as a right ideal it follows that P = 0. Our next
objective will be to show that (A, A, R) is contained in P. We have seen
previously that (A, A, R) is contained in A and also that A(A, R, R) = 0.
But then 


0 = aF(b, d, x, y)
  = a(bd, x, y) − a(b, dx, y) + a(b, d, xy) − a[b(d, x, y)] − a[(b, d, x)y]
  = −a[(b, d, x)y],


and so (A, A, R) is contained in P. But then (A, A, R) = 0. As a consequence
of the last identity

0 = F(a, b, x, y) = (ab, x, y) − (a, bx, y) + (a, b, xy) − a(b, x, y)
    − (a, b, x)y = (ab, x, y).

In other words (A², R, R) = 0. As in the proof of Theorem 4 we can now use
0 = C(ab, x, y, z) to show that (A², (R, R, R)) = 0, and (12) to show that
(A², (R, R, R)R) = 0. Since the two-sided ideal generated by all associators
must be all of R, we have demonstrated that (A², R) = 0. But then x(ab) =
(ab)x = (a, b, x) + a(bx) = a(bx), proving that A² is a two-sided ideal of R.
Since we have assumed that A² ≠ 0, it must be the case that A² = R. But
A² is contained in A, so that A = R. This contradicts the assumption that A
is a minimal right ideal of R. Consequently A² = 0. This completes the proof
of the theorem.


The next result plays a very important part in the Main Theorem. 


THEOREM 6. Let R be a simple ring of type (1, 1), with chain conditions on 
right ideals, that is not associative. Then the number of maximal right ideals and 
the number of minimal right ideals are both greater than one. 


Proof. The existence of at least one maximal right ideal and of at least one
minimal right ideal are insured by the chain conditions and Theorem 3.
Suppose that R has only one maximal right ideal. Call it B. Consider an
arbitrary element u of the form u = (y, x, x), and let C be the right ideal
generated by u. Then, because of Theorem 2, uC = 0 = Cu. If u is not an
element of B, then C is not contained in B. But B is the unique maximal right
ideal of R and there is only one right ideal not contained in it, namely R.
Thence uR = 0 = Ru. But the absolute divisors of zero of R form a two-sided
ideal of R, which cannot be all of R. Consequently u = 0, contrary to
assumption. Thus u is an element of B for all x and y in R. Replace x by x + z
in u.


[Fig. 1. Diagram of the right ideals R, B, D, E, A and 0.]





Then as a result of (1) we note that all associators of R must be contained in
B and thereby also all right multiples of associators. But then B = R, a
contradiction since B was assumed to be a maximal right ideal. Because of
this contradiction B cannot be the unique maximal right ideal and consequently
R must have at least two maximal right ideals. This completes the first half
of the theorem. Suppose now that A is the only minimal right ideal of R.
Define u as before, as well as C, so that uC = 0 = Cu. We see at once that
either C contains A or C must be zero. In the latter case u = 0. In the former
uA = 0 = Au. But then we may conclude that uA = 0 = Au in either case.
Replacing x by x + z in the last identity and using (1) we obtain (R, R, R)A =
0 = A(R, R, R). Let w, x, y, z be arbitrary elements of R and a an arbitrary
element of A. Then

0 = F(w, x, y, z)a
  = (wx, y, z)a − (w, xy, z)a + (w, x, yz)a − [w(x, y, z)]a − [(w, x, y)z]a
  = −[w(x, y, z)]a − [(w, x, y)z]a.

Therefore in q = [w(x, y, z)]a all permutations of x, y, and z do not alter the
value of q. But 0 = (x, y, z) + (y, z, x) + (z, x, y), because of (2). Therefore
3q = 0, so that q = 0. But then [(w, x, y)z]a = 0. In summary, we have
shown that (R, R, R)A = 0, and that [(R, R, R)R]A = 0. As before we can
deduce from this RA = 0, so that A = 0. However, A was chosen to be a
minimal right ideal and this is clearly a contradiction. Hence R has at least
two minimal right ideals. This concludes the proof of the theorem.
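
The passage from the expansion of F(w, x, y, z)a to 3q = 0 can be sketched as follows. The notation q(x, y, z) is introduced only for this sketch, and (1) is assumed to read (x, y, z) = (x, z, y).

```latex
% q(x,y,z) := [w(x,y,z)]a, with w, x, y, z in R and a in the minimal right ideal A.
\begin{align*}
q(x,y,z) &= q(x,z,y)
   && \text{by (1)},\\
q(x,y,z) &= -[(w,x,y)z]a = -[(w,y,x)z]a = q(y,x,z)
   && \text{by the expansion of } F(w,x,y,z)a \text{ and (1)},\\
0 &= \bigl[w\bigl((x,y,z) + (y,z,x) + (z,x,y)\bigr)\bigr]a
   && \text{by (2)}\\
  &= q(x,y,z) + q(y,z,x) + q(z,x,y) = 3\,q(x,y,z).
\end{align*}
% The two transpositions generate all permutations of x, y, z, which is why the
% three cyclic terms in the last line are equal.
```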

The last result seems to leave open the possibility that a maximal right 
ideal might be a minimal right ideal. However we shall see later that this 
cannot happen. In fact every minimal right ideal will be seen to be contained 
in every maximal right ideal, and any such pair are always separated by at 
least one intermediate right ideal (Theorem 8). 


THEOREM 7. Let R be a simple ring of type (1, 1). Suppose A and B are right
ideals of R such that A² = 0, A + B = R, and B ≠ R. Then R is associative.


Proof. Since A² = 0, we see that (A, A, R) = 0. But then (A, R, A) = 0
and (R, A, A) = 0 as a result of (1) and (2). Also (B, R, R) is contained in B
and therefore so is (R, B, B), as a result of (2). Expanding

(R, R, R) = (A + B, A + B, A + B)
          = (A, A, A) + (B, B, B) + (A, B, B) + (B, A, B) + (B, B, A)
            + (B, A, A) + (A, B, A) + (A, A, B),

it becomes evident that (R, R, R) is contained in B. But then also (R, R, R)R
is contained in B. Since B ≠ R, the only two-sided ideal of R that is contained
in B is zero. But we have just seen that the ideal generated by all associators
is contained in B. Therefore R must be associative. This completes the proof
of the theorem.


COROLLARY. Let R be a simple ring of type (1, 1) that is not associative. If A
is a minimal right ideal of R and B a maximal right ideal of R then A is contained 
in B. 


Proof. Suppose that A is not contained in B. Since B is a maximal right 
ideal of R, A + B = R. Since A is a minimal right ideal of R, we may use 
Theorem 5 to obtain A² = 0. But then the hypothesis of Theorem 7 is satisfied,
so that R must be associative. From this contradiction one deduces that
A is contained in B. This completes the proof of the corollary. 


THEOREM 8. Let R be a simple ring of type (1, 1) with unit element and chain
conditions on right ideals that is not associative. Let B be any maximal right
ideal of R, A any minimal right ideal of R, D the intersection of all the maximal
right ideals of R, and E the union of all the minimal right ideals of R. Then B
is not nil and

0 ⊂ A ⊂ E ⊆ D ⊂ B ⊂ R.
















Proof. Suppose that B is nil (that means every element of B is nilpotent).
Theorem 6 implies the existence of another maximal right ideal B′ ≠ B.
Therefore B + B′ = R. Since the unit element 1 is in R, there must exist
elements x in B and y in B′, such that 1 = x + y. Then 1 − x = y. Suppose
that x^k = 0. Let s = 1 + x + ... + x^(k−1). Then (1 − x)s = 1 = ys. But this
implies that 1 is in B′, so that B′ = R, contrary to assumption. Thus B cannot
be nil. On the other hand Theorem 5 tells us that A² = 0, so that A is certainly
nil. Therefore B ≠ A. Because of Theorem 6, E ≠ A and D ≠ B. Clearly A
is contained in E, and D is contained in B. From the Corollary to Theorem 7
it follows that A is contained in B and hence in D. But then E must be
contained in B. So far all the inclusions have been proper. However the best we
can say about E and D is that E is contained in D, but in this case we are
unable to eliminate the possibility that E = D. This completes the proof of
the theorem.
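
The identity (1 − x)s = 1 used at the start of this proof is the usual telescoping sum. The sketch below assumes that powers of x associate, which is covered by the result of Kokoris quoted in Section 2 for subrings generated by a single element.

```latex
% Assumes x^k = 0 and s = 1 + x + ... + x^{k-1}.
\begin{align*}
(1 - x)\,s &= (1 - x)\bigl(1 + x + \dots + x^{k-1}\bigr) \\
           &= \bigl(1 + x + \dots + x^{k-1}\bigr) - \bigl(x + x^{2} + \dots + x^{k}\bigr) \\
           &= 1 - x^{k} = 1 .
\end{align*}
% Since y = 1 - x lies in the right ideal B', so does ys = 1.
```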

Fig. 1 indicates the simplest possible structure of any ring R satisfying the 
hypothesis of Theorem 8, if indeed such a ring exists. The B_i denote maximal
right ideals and the A_i minimal right ideals. D and E are defined in the
statement of Theorem 8.


4. Main section. We shall make use of the following theorem, whose proof 
appears in (4). 


THEOREM 9 (Kokoris). Let R be a simple ring of type (1, 1) that is not
associative, and e any idempotent of R. Then e must be the unit element 1 of R.


There appears to be a minor gap in Kokoris’ proof, but fortunately a simple 
permutation of the facts already proved in (4) can be used to prove Theorem 9. 
We proceed with the details. In the proof of his Lemma 3, the element a = xy,
where x is in R_10 and y is in R_01, is not the most general element of G_10.
Rather G_10 consists of sums of such elements and hence one can only say that
G_10 is the sum of nilpotent elements rather than that G_10 is nil. Let us
consider the case when R is simple and H = R. Then R_11 = R_10 R_01. Moreover,
it is proved that R_11 commutes with R_10 R_01, so that R_11 is commutative. Now
the fact that R_11 is the sum of nilpotent elements suffices to establish that
R_11 is nil, and this of course contradicts the fact that e is in R_11.

We shall also make use of the following result about algebras of type (1, 1) 
(understood to be finite dimensional), whose proof may be found in (1). 


THEOREM 10 (Albert). A nil algebra of type (1, 1) is nilpotent.

With this background we are ready to prove the result stated in the title of the
present paper.


MAIN THEOREM. Simple algebras of type (1, 1) are associative. 


Proof. Let R be a simple algebra of type (1, 1) that is not associative. We
shall attempt to show that R satisfies the hypothesis of Theorem 8, but not
one of the conclusions, thus obtaining the necessary contradiction. If R were
nil then it would be nilpotent. Since R² is an ideal of R, either R² = R or
R² = 0. If R² = 0, then R would be associative, contrary to assumption. On the
other hand R² = R would contradict the fact that R is nilpotent. Therefore
R is not nil. Suppose x is some element of R that is not nilpotent. The
subalgebra S of R that is generated by x therefore cannot be nil. Since S is a
finite dimensional, associative algebra it must contain an idempotent e. But
then Theorem 9 implies that e = 1. Thus R contains a unit element. Since R
is a finite dimensional algebra with unit element it clearly has ascending and
descending chain conditions on right ideals. Thus R satisfies the hypothesis
of Theorem 8. Let B be any maximal right ideal of R. If 1 were an element in
B then we would have B = R, a contradiction. Hence 1 is not an element of B.
Let y be an arbitrary element of B and T the subalgebra generated by y. If
T were not nil then it would have to contain an idempotent. However, that
is impossible since Theorem 9 limits any idempotent in B to be 1, and we
have already seen that 1 is not in B. Thus T is nilpotent, which implies that
B is nil. We have reached a contradiction since one of the conclusions of
Theorem 8 states that B cannot be nil. The contradiction arose from the
assumption that R was not associative. Therefore R is associative. This
concludes the proof of the theorem.

Once it is known that simple algebras of type (1, 1) are associative, it is 
easy to extend this result to semi-simple algebras. The radical may be defined 
as the maximal nil ideal. One such argument follows closely the one given in 
(3) for algebras of type (γ, δ), where γ ≠ 1, −1, and need not be repeated
here. 

At this point we shall demonstrate how the main theorem carries over to a 
large extent to rings without finiteness assumptions. This also results in a 
second and somewhat more direct proof of the main theorem (Corollary 3 of 
Theorem 11). As usual a ring R is defined to be primitive in case it has a 
maximal right ideal A, which contains no two-sided ideal of R other than the 
zero ideal and in case there exists an element e in R such that ex — x is always 
in A for all x in R. 


THEOREM 11. If R is a primitive ring of type (1, 1) then R is associative.

COROLLARY 1. If R is a semi-simple ring of type (1, 1) then R is associative.


COROLLARY 2. If R is a simple ring of type (1, 1) and contains an idempotent 
then R is associative. 


COROLLARY 3. If R is a simple, finite dimensional algebra of type (1, 1) then
R is associative. 


Proof. Let A be a regular maximal right ideal of R which contains no two- 
sided ideal of R other than the zero ideal and assume that R is not associative. 
We assert that there exists at least one element u of the form u = (x, y, x) 
which is not contained in A. For assume otherwise. Then (y, x, x) must also 
be in A. Replacing x by x + z it then follows that 2(y, z, x) is in A, for all
elements x, y and z in R. Now it is well known, and can easily be verified
directly, that in an arbitrary ring all finite sums of elements of the form 
(R, R, R) and (R, R, R)R form a two-sided ideal J of R. In this instance J 
would be contained in A. But then by assumption we would have J = 0, and 
R would be associative. This is clearly a contradiction. Hence there must 
exist an element u = (x, y, x) which is not in A. Let C be the right ideal 
generated by u. Since A is a maximal right ideal it follows that A + C = R. 
Then we can find an element a in A and an element c in C such that e = a + c. 
Forming eu − u = au + cu − u, we note that cu = 0 as a result of Theorem
2, while eu − u is in A. Therefore au − u must be an element of A. Since A
is a right ideal au belongs to A, hence u must also be in A. But this is clearly 
a contradiction, since we deliberately chose u not in A. Hence R must have 
been associative to begin with. This completes the proof of the theorem. 

Making use of the Jacobson-Brown radical of a ring it is clear that a semi- 
simple ring is a subdirect sum of primitive rings, so that Corollary 1 follows 
at once from the theorem. 

If R is a simple, non-associative ring of type (1, 1) and contains an idem- 
potent, then as a result of Theorem 9 R must contain a unit element 1. But 
then form a maximal right ideal not containing 1. This must indeed be a 
regular, maximal right ideal of R. It contains no ideal of R other than zero 
since R is simple. Therefore R is primitive, and hence associative as a result 
of the theorem. This is a contradiction. Hence R must have been associative 
to begin with. This establishes Corollary 2. 

If R is a simple, finite dimensional algebra of type (1, 1) then, as in the 
early part of the proof of the main theorem, R is either associative or contains 
an idempotent. Then one may use Corollary 2 in order to establish Corollary 
3. 

Rings of type (1, 1) with radical need not be associative, of course. In fact 
it is not difficult to construct finite dimensional algebras of type (1, 1) which 
are not associative. It is worth noting that there exists a division ring of 
characteristic 2 which satisfies both (1) and (2’), yet is not associative (5). 


5. Rings of type (γ, δ), γ ≠ 1, −1. A ring is said to be of type (γ, δ) in case
identities (2), and (15), which follows, hold. Identity (15) is given by

(15) J(x, y, z) = γ(x, z, y) + δ(y, z, x) + (z, x, y) = 0,

where x, y, z are arbitrary elements of the ring and γ, δ are constant scalar
elements. One may also assume that γ² = δ² − δ + 1, for otherwise one can
readily verify the ring to be associative. Therefore the condition that γ ≠ 1,
−1 is equivalent to the condition that δ ≠ 0, 1. In the remainder of this section
we shall consider rings R of type (γ, δ), γ ≠ 1, −1 whose characteristic is
different from 2 and 3. We shall first develop some essential identities. As
was shown in (2),











(16) G(w, x, y, z) = (w, (x, y, z)) − (x, (y, z, w)) + (y, (z, w, x))
     − (z, (w, x, y)) = 0,

and since this may be proved in a ring satisfying (2) only, it must be satisfied
by all elements of R. From

0 = G(y, x, x, x) + (x, A(x, y, x))
  = (y, (x, x, x)) − (x, (x, x, y)) + (x, (x, y, x)) − (x, (y, x, x))
    + (x, (x, y, x)) + (x, (y, x, x)) + (x, (x, x, y))
  = 2(x, (x, y, x)),

it follows that (x, (x, y, x)) = 0. But then

0 = (x, J(x, x, y))
  = γ(x, (x, y, x)) + δ(x, (x, y, x)) + (x, (y, x, x))
  = (x, (y, x, x)).

But then

0 = (x, A(x, y, x))
  = (x, (x, y, x)) + (x, (y, x, x)) + (x, (x, x, y))
  = (x, (x, x, y)).

If in the last two identities we replace x by x + z and x − z and add, we
obtain

(17) K(x, y, z) = (x, (y, x, z)) + (x, (y, z, x)) + (z, (y, x, x)) = 0,

and

(18) L(x, y, z) = (x, (z, x, y)) + (x, (x, z, y)) + (z, (x, x, y)) = 0.

Now let

t = (x, (x, y, z)) + (x, (x, z, y)), and u = (x, (z, y, x)) + (x, (y, z, x)).

Then

0 = G(x, x, y, z) − K(x, z, y) + L(x, y, z)
  = (x, (x, y, z)) − (x, (y, z, x)) + (y, (z, x, x)) − (z, (x, x, y))
    − (x, (z, x, y)) − (x, (z, y, x)) − (y, (z, x, x)) + (x, (z, x, y))
    + (x, (x, z, y)) + (z, (x, x, y))
  = (x, (x, y, z)) + (x, (x, z, y)) − (x, (z, y, x)) − (x, (y, z, x))
  = t − u.


Consequently t = u. On the other hand

0 = J(y, x, z) + J(z, x, y)
  = γ(y, z, x) + δ(x, z, y) + (z, y, x) + γ(z, y, x) + δ(x, y, z) + (y, z, x)
  = (γ + 1)[(z, y, x) + (y, z, x)] + δ[(x, y, z) + (x, z, y)].

Commuting both sides with x one obtains (γ + δ + 1)t = 0, since t = u.
However

0 = J(x, y, z) + J(x, z, y) − A(x, y, z) − A(x, z, y)
  = γ(x, z, y) + δ(y, z, x) + (z, x, y) + γ(x, y, z) + δ(z, y, x)
    + (y, x, z) − (x, y, z) − (y, z, x) − (z, x, y) − (x, z, y)
    − (z, y, x) − (y, x, z)
  = (γ − 1)[(x, y, z) + (x, z, y)] + (δ − 1)[(z, y, x) + (y, z, x)].

Commuting both sides with x one obtains (γ + δ − 2)t = 0. Since both
(γ + δ + 1)t = 0, and (γ + δ − 2)t = 0, it must be that 3t = 0, and so
t = 0. Since u = t, we also have u = 0. We have shown that

(19) (x, (x, y, z)) + (x, (x, z, y)) = 0 = (x, (z, y, x)) + (x, (y, z, x)).
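
Two small computations behind this passage, written out as a sketch: the subtraction that yields 3t = 0, and the equivalence between the conditions on γ and δ stated at the beginning of the section. The second assumes the scalars lie in a field.

```latex
\[
(\gamma + \delta + 1)\,t - (\gamma + \delta - 2)\,t = 3t = 0 ,
\]
% so t = 0 once the characteristic is different from 3, and then u = t = 0.
% The relation \gamma^2 = \delta^2 - \delta + 1 can be rewritten as
\[
\gamma^2 - 1 = \delta(\delta - 1),
\]
% so \gamma = 1 or \gamma = -1 exactly when \delta = 0 or \delta = 1.
```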


Incidentally up to this point we have made no use of the restriction on γ.
However, the next result makes use of this assumption.


THEOREM 12. Let R be a simple ring of type (γ, δ), γ ≠ 1, −1 that is not
associative. Then R has no proper left or right ideals.


Proof. Let B be any proper right ideal of R. Define S as the set of all elements
s of B with the property that Rs is always contained in B. Let x, y, z denote
arbitrary elements of R, a, b arbitrary elements of B and s an arbitrary element
of S. Since B is a right ideal of R, (b, x, y) must be an element of B. But then

0 = J(b, y, x) − A(b, y, x)
  = γ(b, x, y) + δ(y, x, b) + (x, b, y) − (b, y, x) − (y, x, b) − (x, b, y)
  = (δ − 1)(y, x, b) + γ(b, x, y) − (b, y, x),

so that (δ − 1)(y, x, b) is in B. Since δ ≠ 1, this implies that (y, x, b) is in B.
Expanding 0 = A(b, x, y), we note that also (y, b, x) is in B. Clearly S is
closed under subtraction. We now show that in fact S is an ideal of R. Since s
is in B, sy will be also. Then z(sy) = −(z, s, y) + (zs)y. We have already
noted that (z, s, y) is in B. Also it follows from the definition of S that zs
is in B. Since B is a right ideal of R, (zs)y must be in B. Thereby z(sy) is also
in B. But this implies that sy is in S, so that S is a right ideal of R. Again from
the definition of S it follows that ys is in B. Then z(ys) = −(z, y, s) + (zy)s,
and so z(ys) is also in B. But then ys is in S and therefore S is a two-sided ideal
of R. However, B is a proper right ideal of R and S is contained in B.
Consequently, since R is simple, S = 0. Next we proceed to show that a number of
elements are zero by virtue of the fact that we can prove they are contained
in S. Thus 0 = F(x, y, a, b) = (xy, a, b) − (x, ya, b) + (x, y, ab) − x(y, a, b)
− (x, y, a)b, implies that −x(y, a, b) is contained in B. But then (y, a, b) is
contained in S and hence (R, B, B) = 0. But then

0 = J(x, a, b) − A(x, a, b)
  = γ(x, b, a) + δ(a, b, x) + (b, x, a) − (x, a, b) − (a, b, x) − (b, x, a)
  = (δ − 1)(a, b, x).


—_— 








144 





ERWIN KLEINFELD 


Since 6 # 1, (B, B, R) = 0. At this point 


0 = A(x, a,b) = (x, a, b) + (a, b, x) + (6, x, a) = (b,x, a), 


so that (B, R, B) = 0. In summary, we have shown that 


(20) $(B, B, R) = (B, R, B) = (R, B, B) = 0$.

Set $x = b$, $y = z = x$ in (19). Then we obtain $(b, (b, x, x)) + (b, (b, x, x)) = 0$. Hence $(b, (b, x, x)) = 0$. We shall now establish that

(21) $(b^2, x, x) = b(b, x, x) = (b, x, x)b$.

So far we have been able to show that the second and third terms of (21) are equal. Moreover

$0 = F(b, b, x, x)$
$\;\; = (b^2, x, x) - (b, bx, x) + (b, b, x^2) - b(b, x, x) - (b, b, x)x$
$\;\; = (b^2, x, x) - b(b, x, x),$

because of (20). Thus the first term of (21) is equal to the second term. This establishes (21). Since

$0 = J(x, y, x)$
$\;\; = \gamma(x, x, y) + \delta(y, x, x) + (x, x, y)$
$\;\; = (\gamma + 1)(x, x, y) + \delta(y, x, x)$

and $\gamma \neq -1$, we have

$(x, x, y) = -\dfrac{\delta}{\gamma + 1}\,(y, x, x).$

But then substituting $y = b^2$ we obtain

$(x, x, b^2) = -\dfrac{\delta}{\gamma + 1}\,(b^2, x, x).$

Similarly, substituting $y = b$,

$(x, x, b) = -\dfrac{\delta}{\gamma + 1}\,(b, x, x),$

and so

$(x, x, b)b = -\dfrac{\delta}{\gamma + 1}\,(b, x, x)b.$

But we have already seen in (21) that $(b^2, x, x) = (b, x, x)b$. Then we may conclude that $(x, x, b^2) = (x, x, b)b$. Expanding

$0 = F(x, x, b, b)$
$\;\; = (x^2, b, b) - (x, xb, b) + (x, x, b^2) - x(x, b, b) - (x, x, b)b,$

we see that $-(x, xb, b) = 0$, as a result of the previous identity and (20). But then

$0 = J(b, x, xb)$
$\;\; = \gamma(b, xb, x) + \delta(x, xb, b) + (xb, b, x)$
$\;\; = \gamma(b, xb, x) + (xb, b, x).$

However,

$0 = F(b, x, b, x)$
$\;\; = (bx, b, x) - (b, xb, x) + (b, x, bx) - b(x, b, x) - (b, x, b)x$
$\;\; = -(b, xb, x) - b(x, b, x).$

This implies that $(b, xb, x) = -b(x, b, x)$. But then $-\gamma b(x, b, x) + (xb, b, x) = 0$. From

$0 = F(x, b, b, x)$
$\;\; = (xb, b, x) - (x, b^2, x) + (x, b, bx) - x(b, b, x) - (x, b, b)x$
$\;\; = (xb, b, x) - (x, b^2, x)$

it follows that $(xb, b, x) = (x, b^2, x)$. Thus $-\gamma b(x, b, x) + (x, b^2, x) = 0$. In

$0 = J(x, x, y)$
$\;\; = \gamma(x, y, x) + \delta(x, y, x) + (y, x, x)$
$\;\; = (\gamma + \delta)(x, y, x) + (y, x, x),$

substitute $y = b^2$ to obtain $(\gamma + \delta)(x, b^2, x) + (b^2, x, x) = 0$, and also $(\gamma + \delta)b(x, b, x) + b(b, x, x) = 0$. We have already established in (21) that $(b^2, x, x) = b(b, x, x)$, so that $(\gamma + \delta)(x, b^2, x) = (\gamma + \delta)b(x, b, x)$. If $\gamma + \delta = 0$, then substituting in $\gamma^2 = \delta^2 - \delta + 1$ we see that $\delta = 1$, contrary to assumption. Therefore $(x, b^2, x) = b(x, b, x)$. Since $-\gamma b(x, b, x) + (x, b^2, x) = 0$ has already been established, we combine the last two identities and get $(1 - \gamma)b(x, b, x) = 0$. But $\gamma \neq 1$, so that $b(x, b, x) = 0$, and hence all the terms in (21) are zero. Replacing $x$ by $x + y$ in our last identity we see that

(22) $b(x, b, y) = -b(y, b, x)$.


We showed that $(x, xb, b) = 0$ earlier in the proof. On the other hand

$0 = F(x, b, b, x)$
$\;\; = (xb, b, x) - (x, b^2, x) + (x, b, bx) - x(b, b, x) - (x, b, b)x$
$\;\; = (xb, b, x).$

But then

$0 = J(xb, b, x) - \delta A(b, x, xb)$
$\;\; = \gamma(xb, x, b) + \delta(b, x, xb) + (x, xb, b) - \delta(b, x, xb) - \delta(x, xb, b) - \delta(xb, b, x)$
$\;\; = \gamma(xb, x, b).$

Since $\gamma \neq 0$, $(xb, x, b) = 0$. Substituting $x + y$ for $x$ in this last identity we get

(23) $(xb, y, b) = -(yb, x, b)$.

In the second part of (19) replace $x$ by $w + x$, so that

$(w, (z, y, x)) + (x, (z, y, w)) + (x, (y, z, w)) + (w, (y, z, x)) = 0.$

Now let $w = z = b$. Then because of (20), $(b, (b, y, x)) + (b, (y, b, x)) = 0$. From (19) and (2) one proves that $(z, (y, z, x)) + (z, (x, z, y)) = 0$. Then if we let $z = b$, $(b, (y, b, x)) + (b, (x, b, y)) = 0$. But then

$0 = (b, J(x, y, b))$
$\;\; = (b, \gamma(x, b, y) + \delta(y, b, x) + (b, x, y))$
$\;\; = (\gamma - \delta - 1)(b, (x, b, y)).$

If $\gamma = \delta + 1$ and $\gamma^2 = \delta^2 - \delta + 1$ then $3\delta = 0$ so that $\delta = 0$, contrary to assumption. Therefore $(b, (x, b, y)) = 0$. From this it follows readily that

(24) $(b, (b, y, x)) = 0 = (b, (y, b, x))$.
Then

$0 = F(b, y, x, b)$
$\;\; = (by, x, b) - (b, yx, b) + (b, y, xb) - b(y, x, b) - (b, y, x)b$
$\;\; = (b, y, xb) - b(y, x, b) - (b, y, x)b.$

Now using (24),

$b(y, x, b) + (b, y, x)b = b(y, x, b) + b(b, y, x) = bA(y, x, b) - b(x, b, y).$

Hence $(b, y, xb) = -b(x, b, y) = b(y, b, x)$, using (22). We have demonstrated that

(25) $(b, y, xb) = b(y, b, x)$.

Now

$0 = F(y, b, x, b)$
$\;\; = (yb, x, b) - (y, bx, b) + (y, b, xb) - y(b, x, b) - (y, b, x)b$
$\;\; = (yb, x, b) + (y, b, xb) - (y, b, x)b.$

Therefore $(yb, x, b) + (y, b, xb) = (y, b, x)b = b(y, b, x)$, as a result of (24). But then $(y, b, xb) = -(yb, x, b) + b(y, b, x) = (xb, y, b) - b(x, b, y)$, because of (23) and (22). We have demonstrated that

(26) $(y, b, xb) = (xb, y, b) - b(x, b, y)$.

Now

$0 = F(b, x, b, y) + A(xb, y, b)$
$\;\; = (bx, b, y) - (b, xb, y) + (b, x, by) - b(x, b, y)$
$\;\;\quad - (b, x, b)y + (xb, y, b) + (y, b, xb) + (b, xb, y)$
$\;\; = -b(x, b, y) + (xb, y, b) + (y, b, xb).$

But then comparison of the last identity with (26) shows that

(27) $(xb, y, b) = b(x, b, y)$,

and

(28) $(y, b, xb) = 0$.
From $0 = J(b, xb, y) = \gamma(b, y, xb) + \delta(xb, y, b) + (y, b, xb)$, we get $\gamma(b, y, xb) + \delta(xb, y, b) = 0$, using (28). But then as a result of (22), (25), and (27), $(\gamma - \delta)b(y, b, x) = 0$. Since $\gamma \neq \delta$, then $b(y, b, x) = 0$. As before one may also deduce $b(b, x, y) = 0$, by use of (2) and (15). But then

$0 = F(b, b, x, y)$
$\;\; = (b^2, x, y) - (b, bx, y) + (b, b, xy) - b(b, x, y) - (b, b, x)y$
$\;\; = (b^2, x, y).$

Again using (2) and (15) one may deduce that $(x, y, b^2) = (y, b^2, x) = 0$. In other words $b^2$ must lie in the nucleus $N$ of $R$. Furthermore it follows from an argument presented in the Appendix of (4) that therefore $(b^2, R) = 0$. But then we have $b^2$ in $B$ and $xb^2 = b^2x$ is also in $B$, so that $b^2$ is in $S$. Therefore $b^2 = 0$. Replacing $b$ by $a + b$ we see that

(29) $ab + ba = 0$.


Now

$0 = K(x, a, b)$
$\;\; = (x, (a, x, b)) + (x, (a, b, x)) + (b, (a, x, x))$
$\;\; = (b, (a, x, x)).$

But then $(b, (x, a, x)) = 0$, as a result of (2) and (15). On the other hand $(x, a, x)$ is in $B$, so that $b(x, a, x) = -(x, a, x)b$, using (29). Consequently

(30) $b(x, a, x) = 0 = (x, a, x)b$.

Then

$0 = F(x, a, x, b)$
$\;\; = (xa, x, b) - (x, ax, b) + (x, a, xb) - x(a, x, b) - (x, a, x)b$
$\;\; = (xa, x, b) + (x, a, xb).$

As a result of substituting $y = x$ in (28) we obtain $(x, b, xb) = 0$. At this point replace $b$ by $a + b$ in the last identity. Then one gets $(x, a, xb) = -(x, b, xa)$. Therefore $(xa, x, b) = (x, b, xa)$. Adding to the last identity

$0 = A(xa, x, b) = (xa, x, b) + (x, b, xa) + (b, xa, x),$

we get

$2(xa, x, b) + (b, xa, x) = 0.$

From

$0 = F(b, x, a, x)$
$\;\; = (bx, a, x) - (b, xa, x) + (b, x, ax) - b(x, a, x) - (b, x, a)x$
$\;\; = -(b, xa, x)$

one may now deduce that $2(xa, x, b) = 0$, so that $(xa, x, b) = 0$. But we have noted previously that $(xa, x, b) + (x, a, xb) = 0$. Hence

(31) $(x, a, xb) = 0$.

We note that

$0 = F(x, a, y, b)$
$\;\; = (xa, y, b) - (x, ay, b) + (x, a, yb) - x(a, y, b) - (x, a, y)b$
$\;\; = (xa, y, b) + (x, a, yb) - (x, a, y)b.$

However, replacing $x$ by $x + y$ in (31) we see that $(x, a, yb) = -(y, a, xb)$. Replacing $b$ by $a + b$ in (28) shows that $-(y, a, xb) = (y, b, xa)$. Therefore $(x, a, yb) = (y, b, xa)$. Consequently $(xa, y, b) + (y, b, xa) = (x, a, y)b$. Subtracting $0 = A(xa, y, b) = (xa, y, b) + (y, b, xa) + (b, xa, y)$ from the last equation we see that $-(b, xa, y) = (x, a, y)b$. Because of (29) it follows that $(x, a, y)b = -b(x, a, y)$, and thereby $(b, xa, y) = b(x, a, y)$. Comparing the last identity with

$0 = F(b, x, a, y)$
$\;\; = (bx, a, y) - (b, xa, y) + (b, x, ay) - b(x, a, y) - (b, x, a)y$
$\;\; = -(b, xa, y) - b(x, a, y),$

we conclude that $(b, xa, y) = b(x, a, y) = 0$. Consequently $(B, RB, R) = 0$. Define $D = B + RB$. It is a simple matter to verify that $D$ is a two-sided ideal of $R$. Since $D$ contains $B$, a non-trivial right ideal of $R$, it must be that $D = R$. Therefore $(B, RB, R) = 0$ and $(B, B, R) = 0$ imply $(B, R, R) = (B, D, R) = 0$. But then $B$ is contained in the nucleus $N$ of $R$ and as before then $(B, R) = 0$ follows from (2). This, however, suffices to show that $B$ is contained in $S$, so that $B = 0$. But this is clearly a contradiction. It arose out of the original assumption that $B$ was a proper right ideal of $R$. Therefore $R$ can have no proper right ideals. The argument that $R$ can have no proper left ideals follows from the fact that a ring of type $(\gamma, \delta)$ is anti-isomorphic to one of type $(-\gamma, 1 - \delta)$ (4, Theorem 1). This completes the proof of the theorem.
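
The proof excludes several parameter values by appeal to the relation $\gamma^2 = \delta^2 - \delta + 1$ quoted in the text. The following worked lines are a sketch of that arithmetic only; they use no assumption beyond that relation. The last line checks that the anti-isomorphic type $(-\gamma, 1 - \delta)$ satisfies the same relation.

```latex
\begin{align*}
\gamma + \delta = 0
  &\;\Longrightarrow\; \gamma^2 = \delta^2
  \;\Longrightarrow\; 0 = -\delta + 1
  \;\Longrightarrow\; \delta = 1,\\
\gamma = \delta + 1
  &\;\Longrightarrow\; \delta^2 + 2\delta + 1 = \delta^2 - \delta + 1
  \;\Longrightarrow\; 3\delta = 0,\\
(\gamma', \delta') = (-\gamma,\, 1 - \delta)
  &\;\Longrightarrow\; \delta'^2 - \delta' + 1
     = (1 - 2\delta + \delta^2) - (1 - \delta) + 1
     = \delta^2 - \delta + 1 = \gamma^2 = \gamma'^2 .
\end{align*}
```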

We have purposely omitted from our discussion the rings of type (—1, 1) 
and their anti-isomorphic copies, the rings of type (1,0). The former are 
right alternative rings that satisfy (2). From the structure theory of right 
alternative algebras it follows that simple algebras of type (—1, 1) whose 
characteristic is different from 2 and 3 are associative. This result is analogous 
to our Main Theorem. Rings of type $(-1, 1)$ have been considered by Maneri
for his Ph.D. dissertation. His results will be published elsewhere.


REFERENCES 


1. A. A. Albert, Almost alternative algebras, Portugal. Math., 8 (1949), 23-36.

2. E. Kleinfeld, Rings of $(\gamma, \delta)$ type, Portugal. Math., 18 (1959), 107-110.

3. L. A. Kokoris, On a class of almost alternative algebras, Can. J. Math., 8 (1956), 250-255.

4. L. A. Kokoris, On rings of $(\gamma, \delta)$ type, Proc. Amer. Math. Soc., 9 (1958), 897-904.

5. R. L. San Soucie, Right alternative division rings of characteristic 2, Proc. Amer. Math. Soc., 6 (1955), 291-296.




Ohio State University 













THE GROUPS OF REGULAR COMPLEX POLYGONS 
D. W. CROWE 


1. Introduction. The two-dimensional unitary space, $U_2$, is a complex vector space of points $(x, y) = (x_1 + ix_2, y_1 + iy_2)$, for which the distance between $(x, y)$ and $(x', y')$ is defined by $[(x - x')(\bar{x} - \bar{x}') + (y - y')(\bar{y} - \bar{y}')]^{1/2}$. A unitary transformation is a linear transformation which preserves distance. A line is the set of points $(x, y)$ satisfying some complex equation $ax + by = c$. A unitary transformation is a (unitary) reflection if it is of finite period $n > 1$ and leaves a line pointwise invariant. Thus a unitary matrix represents a reflection if its two characteristic roots are 1 and a complex $n$th root ($n > 1$) of 1.

Shephard (7) has introduced the notion of regular complex polygon as follows. Consider a configuration $P$ consisting of points ("vertices") and lines ("edges") in $U_2$. If the group of automorphisms of $P$ is generated by two reflections, one, say $S$, which permutes cyclically the vertices on an edge and another, $T$, which permutes cyclically the edges at one of these vertices, then $P$ is called a regular complex polygon. Now the finite groups in $U_2$ generated by $S$ and $T$ can be interpreted as finite groups of orthogonal transformations in four-dimensional Euclidean space, $E_4$. These groups in $E_4$ have been enumerated by Seifert and Threlfall (6), using the fact that each such group is homomorphic (either 2:1 or 1:1) to one of the finite groups of displacements in elliptic space of three dimensions enumerated by Goursat (5). The purpose of this paper is to find the groups in Goursat's list corresponding to Shephard's groups.

In §2 we find quaternion transformations $q' = aqb$ corresponding to the generators of Shephard's groups. In §3 these are used to associate groups $\mathfrak{L}$ and $\mathfrak{R}$ of Clifford translations to Shephard's groups. (This discussion closely follows that of (6).) In §4 Goursat's groups are described analogously, leading to the natural homomorphism between Shephard's groups and Goursat's described in §5. The results are tabulated, and summarized in the Theorem.

We write $\mathfrak{C}_n$ and $\mathfrak{E}$ for cyclic groups of order $n$ and 1 respectively. The polyhedral group defined by $A^\mu = B^\nu = (AB)^2 = E$ is denoted $(2, \mu, \nu)$, and the binary polyhedral group $A^\mu = B^\nu = (AB)^2$ is $\langle 2, \mu, \nu \rangle$. With $\mathfrak{C}_n$ the latter are the only finite groups of quaternions. For quaternions the exponential form $\exp s\pi j/n$ means $\cos s\pi/n + j \sin s\pi/n$. The order of a group $\mathfrak{G}$ is $|\mathfrak{G}|$.





Received June 29, 1959. This paper is a portion of the author's Ph.D. thesis at the University of Michigan, prepared under the direction of Professor H. S. M. Coxeter.


2. The quaternion representation of a unitary reflection. If the point $(x, y) = (x_1 + ix_2, y_1 + iy_2)$ in $U_2$ is represented by the point $(x_1, x_2, y_1, y_2)$ in $E_4$, then the transformation

(2.1) $(x_1' + ix_2',\; y_1' + iy_2') = (x, y)\begin{pmatrix} r_1 + ir_2 & s_1 + is_2 \\ t_1 + it_2 & u_1 + iu_2 \end{pmatrix}$

is represented by the transformation

(2.2) $(x_1', x_2', y_1', y_2') = (x_1, x_2, y_1, y_2)\begin{pmatrix} r_1 & r_2 & s_1 & s_2 \\ -r_2 & r_1 & -s_2 & s_1 \\ t_1 & t_2 & u_1 & u_2 \\ -t_2 & t_1 & -u_2 & u_1 \end{pmatrix}.$

In particular, if 2.1 is a unitary reflection then 2.2 is proper orthogonal. The transformation 2.2 can in turn be expressed as a quaternion transformation (2)

(2.3) $q' = (a_1 + ia_2 + ja_3 + ka_4)\, q\, (b_1 + ib_2 + jb_3 + kb_4),$

where $q = x_1 + ix_2 + jy_1 + ky_2$, $q' = x_1' + ix_2' + jy_1' + ky_2'$, and $Na = Nb = 1$. Since in our case 2.2 corresponds to a unitary reflection, we have also $a_1 = b_1$ (2, p. 141).

The $a_i$ and $b_i$ in 2.3 can be found in terms of the $r_i$, $s_i$, $t_i$, and $u_i$ in 2.2 by applying 2.2 and 2.3 to $(1, 0, 0, 0)$, $(0, 1, 0, 0)$, $(0, 0, 1, 0)$, $(0, 0, 0, 1)$ and $1, i, j, k$ respectively, and equating coefficients. For example, applying 2.2 and 2.3 to $(1, 0, 0, 0)$ and 1 yields

$r_1 = a_1b_1 - a_2b_2 - a_3b_3 - a_4b_4, \qquad r_2 = a_1b_2 + a_2b_1 + a_3b_4 - a_4b_3,$
$s_1 = a_1b_3 + a_3b_1 - a_2b_4 + a_4b_2, \qquad s_2 = a_1b_4 + a_4b_1 + a_2b_3 - a_3b_2.$

Repeating this in the other three cases yields 12 more equations. Adding and subtracting these 16 equations in pairs containing $r_1, r_2, \ldots, u_1, u_2$ yields 16 equivalent equations which reduce to $a_3 = a_4 = 0$ and

$2a_1b_1 = r_1 + u_1, \qquad 2a_1b_3 = s_1 - t_1,$
$2a_1b_2 = r_2 - u_2, \qquad 2a_1b_4 = s_2 + t_2,$
$2a_2b_1 = r_2 + u_2, \qquad 2a_2b_3 = s_2 - t_2,$
$2a_2b_2 = u_1 - r_1, \qquad 2a_2b_4 = -s_1 - t_1.$

These equations readily give the quaternion transformation 2.3 corresponding to a unitary matrix. For example, if

$T = \begin{pmatrix} 1 & 0 \\ 0 & \exp 2\pi i/n \end{pmatrix}$

we get

$a_1b_3 = a_2b_3 = a_1b_4 = a_2b_4 = 0.$

Since either $a_1$ or $a_2$ is different from zero this implies $b_3 = b_4 = 0$. Moreover, $2a_1b_2 = -u_2 = -2a_2b_1$, that is

$a_1/a_2 = -b_1/b_2,$

and since

$a_1b_1 - a_2b_2 = r_1 > 0$

we have $b = \bar{a}$. Then

$2a_1b_1 = 2a_1^2 = r_1 + u_1 = 1 + \cos 2\pi/n,$

so that

$a_1 = \pm\cos \pi/n = b_1.$

We choose the upper sign. Finally

$2a_2b_1 = 2a_2a_1 = 2a_2 \cos \pi/n = \sin 2\pi/n,$

which yields

$a_2 = \sin \pi/n = -b_2.$

Consequently the quaternion form of $T$ is

(2.4) $q' = (\cos \pi/n + i \sin \pi/n)\, q\, (\cos \pi/n - i \sin \pi/n).$
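
The quaternion form 2.4 can be checked numerically against the matrix $T = \mathrm{diag}(1, \exp 2\pi i/n)$. The following is a minimal sketch, not part of the paper: the function names are hypothetical, a quaternion is stored as its $(1, i, j, k)$ coefficients, and the point $(x_1, x_2, y_1, y_2)$ is identified with $q = x_1 + ix_2 + jy_1 + ky_2$ as in the text.

```python
import cmath
import math

def qmul(p, q):
    """Hamilton product of two quaternions given as (1, i, j, k) coefficients."""
    a1, a2, a3, a4 = p
    b1, b2, b3, b4 = q
    return (
        a1 * b1 - a2 * b2 - a3 * b3 - a4 * b4,
        a1 * b2 + a2 * b1 + a3 * b4 - a4 * b3,
        a1 * b3 - a2 * b4 + a3 * b1 + a4 * b2,
        a1 * b4 + a2 * b3 - a3 * b2 + a4 * b1,
    )

def reflect_by_quaternions(point, n):
    """Apply q' = (cos pi/n + i sin pi/n) q (cos pi/n - i sin pi/n), as in 2.4."""
    theta = math.pi / n
    a = (math.cos(theta), math.sin(theta), 0.0, 0.0)
    b = (math.cos(theta), -math.sin(theta), 0.0, 0.0)
    return qmul(qmul(a, point), b)

def reflect_by_matrix(point, n):
    """Apply (x, y) -> (x, y) diag(1, exp(2 pi i / n)) to (x1, x2, y1, y2)."""
    x1, x2, y1, y2 = point
    y = complex(y1, y2) * cmath.exp(2j * math.pi / n)
    return (x1, x2, y.real, y.imag)

def max_difference(point, n):
    """Largest coordinate gap between the two descriptions of T."""
    lhs = reflect_by_quaternions(point, n)
    rhs = reflect_by_matrix(point, n)
    return max(abs(u - v) for u, v in zip(lhs, rhs))
```

Calling `max_difference((0.3, -1.2, 0.7, 2.0), 5)` should give a value at rounding-error level, which is consistent with 2.4 leaving $x$ fixed and multiplying $y$ by $\exp 2\pi i/n$.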


In Table I at the end of this paper we list the groups of the regular complex polygons, writing $p_1[t]p_2$ for the group of the polygon $p_1\{t\}p_2$ in the notation of (4, p. 80). The generators $S$ and $T$ are taken from (7) except for the group $2[n]2$ for which the given $S$ is not a reflection. For this group we let

$S = S^{-1} = \begin{pmatrix} \cos 2\pi/n & \sin 2\pi/n \\ \sin 2\pi/n & -\cos 2\pi/n \end{pmatrix}$ and $T = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$

Professor Coxeter has pointed out that the defining relations for Shephard's groups are particularly simple in terms of $S^{-1}$ and $T$. Consequently we use these generators in preference to $S$ and $T$. The quaternion form of $S^{-1}$ appears in the table. By 2.4, $T$ is always $(\exp \pi i/p_2)\, q\, (\exp -\pi i/p_2)$.


3. The groups $\mathfrak{L}$, $\mathfrak{R}$, $\mathfrak{l}$, $\mathfrak{r}$. If $S^{-1}$ is represented by $q' = aqb$ and $T$ by $q' = cqd$ we designate by $\mathfrak{L}^*$ the group generated by $a$ and $c$, and by $\mathfrak{R}^*$ the group generated by $b$ and $d$. These groups are either cyclic groups or binary polyhedral groups $\langle 2, \mu, \nu \rangle$ (1), and can thus be readily determined. We remark that in all cases $\mathfrak{L}^*$ is cyclic.

We turn now to a more detailed discussion of Shephard's groups in terms of the groups $\mathfrak{L}^*$ and $\mathfrak{R}^*$. Let $\mathfrak{G}$ be such a group. Then $\mathfrak{G}$ is a group of transformations $q' = aqb$, $a \in \mathfrak{L}^*$, $b \in \mathfrak{R}^*$. But not every transformation of the given form is in the group, and furthermore there are certain redundancies. The latter occur because the transformation $q' = aqb$ is identical to the transformation $q' = (-a)q(-b)$. There are no other redundancies of this nature, for if $aqb = cqd$ for all $q$ then $c^{-1}a = db^{-1}$ is a real number, say $s$, so that $c = as^{-1}$ and $d = sb$. But $Na = Nb = Nc = Nd = 1$, so $s = \pm 1$. We remove these redundancies by identifying the elements $(a, b)$ and $(-a, -b)$ of $\mathfrak{L}^* \times \mathfrak{R}^*$. (Observe that multiplication in $\mathfrak{L}^* \times \mathfrak{R}^*$ is defined by $(a, b)(c, d) = (ac, bd)$, which does indeed correspond to composition of the corresponding transformations since the commutativity in $\mathfrak{L}^*$ implies $ac = ca$.) For finite groups of quaternions the only case in which this identification is trivial is if either $\mathfrak{L}^*$ or $\mathfrak{R}^*$ is a cyclic group of odd order. For our groups this is never the case. Essentially we thus form $(\mathfrak{L}^* \times \mathfrak{R}^*)/\mathfrak{C}_2$, whose elements are the classes $\{(a, b)\}$. We let

$\mathfrak{L} = \{\{(a, 1)\} : a \in \mathfrak{L}^*\}$ and $\mathfrak{R} = \{\{(1, b)\} : b \in \mathfrak{R}^*\}.$

The elements of $\mathfrak{L}$ and $\mathfrak{R}$ can be multiplied in the obvious manner:

$\{(a, 1)\}\{(1, b)\} = \{(a, b)\}.$

Clearly $\mathfrak{R} \cong \mathfrak{R}^*$ and $\mathfrak{L} \cong \mathfrak{L}^*$, but $\mathfrak{L}\mathfrak{R} \cong (\mathfrak{L}^* \times \mathfrak{R}^*)/\mathfrak{C}_2$. Every element of $\mathfrak{L}$ commutes with every element of $\mathfrak{R}$. The group $\mathfrak{G}$ is isomorphic to a subgroup of $\mathfrak{L}\mathfrak{R}$ and will be treated as if it were itself a subgroup.

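Let $\mathfrak{l} = \mathfrak{L} \cap \mathfrak{G}$ and $\mathfrak{r} = \mathfrak{R} \cap \mathfrak{G}$. Then $\mathfrak{l}$ is a normal subgroup of $\mathfrak{L}$. For let $L \in \mathfrak{L}$ and $l \in \mathfrak{l}$. Certainly $L^{-1}lL \in \mathfrak{L}$. There is some $R \in \mathfrak{R}$ such that $LR \in \mathfrak{G}$, for $\mathfrak{L}$ consists of exactly such elements $L$. Consequently

$L^{-1}lL = (LR)^{-1}\,l\,(LR) \in \mathfrak{G}.$

That is, $L^{-1}lL \in \mathfrak{L} \cap \mathfrak{G} = \mathfrak{l}$. Similarly, $\mathfrak{r}$ is a normal subgroup of $\mathfrak{R}$. Furthermore $\mathfrak{l}\mathfrak{r}$ is a normal subgroup in $\mathfrak{G}$ of order $\tfrac{1}{2}|\mathfrak{l}||\mathfrak{r}|$. For if $l \in \mathfrak{l}$ and $r \in \mathfrak{r}$ then $(LR)^{-1}\,lr\,(LR) = (L^{-1}lL)(R^{-1}rR) \in \mathfrak{l}\mathfrak{r}$.

We say that an element $L \in \mathfrak{L}$ is paired with an element $R \in \mathfrak{R}$ if $LR \in \mathfrak{G}$. The cosets of $\mathfrak{l}$ in $\mathfrak{L}$ are in 1:1 correspondence (given by the pairing) with the cosets of $\mathfrak{r}$ in $\mathfrak{R}$. For if $L$ and $L'$ are paired with $R$ then $LR$, $(LR)^{-1} = L^{-1}R^{-1}$ and $L'R$ are in $\mathfrak{G}$. Therefore $L^{-1}R^{-1}L'R = L^{-1}L'R^{-1}R = L^{-1}L' \in \mathfrak{G}$. That is, $L^{-1}L' \in \mathfrak{l}$, and consequently $L$ and $L'$ are in the same coset of $\mathfrak{l}$. Conversely, if $L$ and $L'$ are in the same coset and if $L$ is paired with $R$, that is, $LR \in \mathfrak{G}$, we have $L'L^{-1}LR = L'R \in \mathfrak{G}$. That is, $L'$ is also paired with $R$. This correspondence is an isomorphism. For let $L$ be in the coset corresponding to the coset containing $R$, that is, $LR \in \mathfrak{G}$, and let $L'R' \in \mathfrak{G}$. Then

$LRL'R' = LL'RR' \in \mathfrak{G},$

that is $LL'$ is in the coset corresponding to the coset containing $RR'$. This isomorphism

(3.1) $\mathfrak{L}/\mathfrak{l} \cong \mathfrak{R}/\mathfrak{r}$

enables us to determine $\mathfrak{l}$ and $\mathfrak{r}$. For each of the $|\mathfrak{L}|$ elements of $\mathfrak{L}$ is paired with the $|\mathfrak{r}|$ elements of a coset of $\mathfrak{r}$ in $\mathfrak{R}$, and each of the $|\mathfrak{R}|$ elements of $\mathfrak{R}$ is paired with each of the $|\mathfrak{l}|$ elements of a coset of $\mathfrak{l}$ in $\mathfrak{L}$. These pairings give all the elements of $\mathfrak{G}$, but each element appears twice, for if $\{(a, 1)\}$ is paired with $\{(1, b)\}$ then $\{(-a, 1)\}$ is paired with $\{(1, -b)\}$ and $\{(a, 1)\}\{(1, b)\} = \{(-a, 1)\}\{(1, -b)\} = \{(a, b)\}$. That is, $|\mathfrak{G}| = \tfrac{1}{2}|\mathfrak{L}||\mathfrak{r}| = \tfrac{1}{2}|\mathfrak{R}||\mathfrak{l}|$. This gives $|\mathfrak{r}|$ and $|\mathfrak{l}|$ and consequently $\mathfrak{r}$ and $\mathfrak{l}$, since in all but two cases the normal subgroups of these orders are unique. We discuss these two cases separately.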
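
The counting argument can be tried out numerically on the family $2[4]n$. The sketch below is an illustration under stated assumptions, not the paper's procedure: it takes $S^{-1}$ as $(i)q(-k)$ (as read from Table I) and $T$ as in 2.4, closes the pair of generators under composition, and labels each class $\{(a, b)\} = \{(-a, -b)\}$ by a rounded key. The function names and the rounding tolerance are hypothetical choices.

```python
import math

def qmul(p, q):
    """Hamilton product of two quaternions given as (1, i, j, k) coefficients."""
    a1, a2, a3, a4 = p
    b1, b2, b3, b4 = q
    return (
        a1 * b1 - a2 * b2 - a3 * b3 - a4 * b4,
        a1 * b2 + a2 * b1 + a3 * b4 - a4 * b3,
        a1 * b3 - a2 * b4 + a3 * b1 + a4 * b2,
        a1 * b4 + a2 * b3 - a3 * b2 + a4 * b1,
    )

def class_key(a, b, digits=6):
    """Hashable label shared by (a, b) and (-a, -b); adding 0.0 removes -0.0."""
    pos = tuple(round(x, digits) + 0.0 for x in a + b)
    neg = tuple(round(-x, digits) + 0.0 for x in a + b)
    return min(pos, neg)

def generate(generators):
    """Close a set of transformations q -> a q b under composition."""
    one = (1.0, 0.0, 0.0, 0.0)
    seen = {class_key(one, one): (one, one)}
    frontier = [(one, one)]
    while frontier:
        a, b = frontier.pop()
        for c, d in generators:
            # q -> a q b followed by q -> c q d is q -> (c a) q (b d).
            new = (qmul(c, a), qmul(b, d))
            key = class_key(*new)
            if key not in seen:
                seen[key] = new
                frontier.append(new)
    return list(seen.values())

def pair_group_orders(n):
    """Return (|G|, |l|, |r|) for the group generated as described above."""
    theta = math.pi / n
    s_inv = ((0.0, 1.0, 0.0, 0.0), (0.0, 0.0, 0.0, -1.0))  # (i) q (-k)
    t = ((math.cos(theta), math.sin(theta), 0.0, 0.0),
         (math.cos(theta), -math.sin(theta), 0.0, 0.0))     # form 2.4
    group = generate([s_inv, t])
    # Classes {(a, 1)} give l; classes {(1, b)} give r.
    l_order = sum(1 for a, b in group if abs(abs(b[0]) - 1.0) < 1e-9)
    r_order = sum(1 for a, b in group if abs(abs(a[0]) - 1.0) < 1e-9)
    return len(group), l_order, r_order
```

For $n = 2$ the generated left factors form a cyclic group of order 4, and the computed triple $(8, 2, 4)$ agrees with $|\mathfrak{G}| = \tfrac{1}{2}|\mathfrak{L}||\mathfrak{r}| = \tfrac{1}{2} \cdot 4 \cdot 4$.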










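If $\mathfrak{G} = 2[4]n$, $n$ even, we have $\mathfrak{L} = \mathfrak{C}_{2n}$, $\mathfrak{R} = \langle 2, 2, n \rangle$ and $\mathfrak{l} = \mathfrak{C}_n$. In fact the $2n$ elements of $\mathfrak{L}$ are

$\exp s\pi i/n, \quad s = 1, 2, \ldots, 2n.$

(Strictly speaking, these are the elements of $\mathfrak{L}^*$. But $\mathfrak{L} \cong \mathfrak{L}^*$ and it is simpler to write $\pm a$ than $\{(\pm a, 1)\}$. The same applies to $\mathfrak{R}$ and $\mathfrak{R}^*$.) The $4n$ elements of $\mathfrak{R}$ are

$\exp s\pi i/n$ and $k \exp s\pi i/n, \quad s = 1, 2, \ldots, 2n.$

The possible choices of $\mathfrak{r}$, that is, of normal subgroups of index 2 in $\mathfrak{R}$, are $\mathfrak{C}_{2n}$ with elements $\exp s\pi i/n$, $s = 1, 2, \ldots, 2n$, and $\langle 2, 2, n/2 \rangle$ with elements

$\exp 2s\pi i/n$ and $k \exp 2s\pi i/n, \quad s = 1, 2, \ldots, n.$

We know $\mathfrak{G}$ contains the element $T = \{(\exp \pi i/n, \exp -\pi i/n)\}$. Since $\exp \pi i/n$ is not in $\mathfrak{l}$, but in the other coset of $\mathfrak{l}$ in $\mathfrak{L}$, we know $\exp -\pi i/n$ is not in $\mathfrak{r}$. But $\exp -\pi i/n$ is in $\mathfrak{C}_{2n}$. Therefore $\mathfrak{r} = \langle 2, 2, n/2 \rangle$.

In the case of groups $2[n]2$, $n$ divisible by 4, where $\mathfrak{L} = \mathfrak{C}_4$, $\mathfrak{R} = \langle 2, 2, n/2 \rangle$ and $\mathfrak{l} = \mathfrak{C}_2$, it can be verified in a similar manner that $\mathfrak{r} = \mathfrak{C}_n$ and not $\langle 2, 2, n/4 \rangle$. For $T = \{(i, -i)\}$ is in $\mathfrak{G}$ and $i$ is not in $\mathfrak{l}$. But $-i$ is in $\langle 2, 2, n/4 \rangle$, so $\mathfrak{r} = \mathfrak{C}_n$. (In this case the elements of $\mathfrak{R}$ are $\exp 2s\pi j/n$ and $i \exp 2s\pi j/n$, $s = 1, 2, \ldots, n$.)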

Conversely, given the groups $\mathfrak{L}$, $\mathfrak{R}$, $\mathfrak{l}$, $\mathfrak{r}$ from our list we can always determine $\mathfrak{G}$ uniquely. To show this we need only show that if distinct isomorphisms 3.1 are chosen, the groups $\mathfrak{G}$ arising from the corresponding pairings of cosets are isomorphic. In most cases $\mathfrak{R}/\mathfrak{r}$ is $\mathfrak{C}_2$ or $\mathfrak{E}$ and there is thus only one isomorphism. However, there are cases where

$\mathfrak{R}/\mathfrak{r} = \langle 2, 2, n \rangle/\mathfrak{C}_n \cong \mathfrak{C}_4$

or

$\mathfrak{R}/\mathfrak{r} = \langle 2, 3, 3 \rangle/\langle 2, 2, 2 \rangle \cong \mathfrak{C}_3.$

There is only one non-trivial automorphism of $\mathfrak{C}_4$, and it is induced by an automorphism $\alpha$ of $\langle 2, 2, n \rangle$, namely by $\alpha b = (i)b(-i)$, where the elements $b$ of $\langle 2, 2, n \rangle$ are again $\exp s\pi i/n$ and $k \exp s\pi i/n$, $s = 1, 2, \ldots, 2n$. This establishes an isomorphism between the two possible groups $\mathfrak{G}$ by the correspondence $\{(a, b)\} : \{(a, \alpha b)\}$. Similarly, the only non-trivial automorphism of $\langle 2, 3, 3 \rangle/\langle 2, 2, 2 \rangle$ is induced by the automorphism


a (aig) 


of the group $\langle 2, 3, 3 \rangle$ whose elements are $\pm 1$, $\pm i$, $\pm j$, $\pm k$, $\tfrac{1}{2}(\pm 1 \pm i \pm j \pm k)$. That is, an isomorphism between the two groups obtained from the two possible pairings of cosets of $\mathfrak{L}/\mathfrak{l}$ with those of $\langle 2, 3, 3 \rangle/\langle 2, 2, 2 \rangle$ is determined by the correspondence $\{(a, b)\} : \{(a, \beta b)\}$.


4. The relevant Goursat groups. Goursat (5) has shown that the finite groups of motions in elliptic 3-space can all be obtained in an analogous fashion from pairings of corresponding cosets of isomorphic quotient groups of polyhedral or cyclic groups. Explicitly, consider a polyhedral or cyclic group $\mathfrak{L}'$ with a normal subgroup $\mathfrak{l}'$ and a polyhedral or cyclic group $\mathfrak{R}'$ with a normal subgroup $\mathfrak{r}'$ such that

(4.1) $\mathfrak{L}'/\mathfrak{l}' \cong \mathfrak{R}'/\mathfrak{r}'.$

Then $|\mathfrak{L}'||\mathfrak{r}'| = |\mathfrak{R}'||\mathfrak{l}'|$ is the order of a group $\mathfrak{G}'$ whose elements are the pairs $(a, b)$ for which $b$ is an element of the coset of $\mathfrak{r}'$ in $\mathfrak{R}'$ which corresponds to the coset containing $a$ in the isomorphism 4.1. The multiplication of elements of $\mathfrak{G}'$ is defined by $(a, b)(c, d) = (ac, bd)$, and $\mathfrak{G}'$ is a subgroup of $\mathfrak{L}' \times \mathfrak{R}'$.

In all but one of the cases which concern us the quotient groups 4.1 are either $\mathfrak{C}_2$ or $\mathfrak{E}$. Consequently the given construction for $\mathfrak{G}'$ is unambiguous. In the remaining case, where $\mathfrak{R}' = (2, 3, 3)$ and $\mathfrak{r}' = (2, 2, 2)$, there are two distinct ways of pairing the cosets of $\mathfrak{L}'/\mathfrak{l}'$ with those of $\mathfrak{R}'/\mathfrak{r}'$. But these two pairings again lead to isomorphic groups, for there is an automorphism of $(2, 3, 3)$ which induces the non-trivial automorphism of $(2, 3, 3)/(2, 2, 2)$ as follows. Let $(2, 3, 3)$ be the group of a regular tetrahedron, and let $\gamma$ be a rotation by angle $\pi$ about a line joining the midpoints of opposite edges of a cube whose vertices are those of the tetrahedron and its dual. Then the transformation $\gamma b \gamma^{-1}$ induces the non-trivial automorphism of $(2, 3, 3)/(2, 2, 2)$. Consequently an isomorphism between the two possible pair groups is given by the correspondence $(a, b) : (a, \gamma b \gamma^{-1})$.


5. The homomorphism from Shephard's groups to Goursat's. Let a group $\mathfrak{G}$ of our list be given by 3.1. Consider the natural homomorphism from the cyclic or binary polyhedral group $\mathfrak{L}$ to the corresponding cyclic or polyhedral group $\mathfrak{L}'$ obtained by identifying the elements $\pm a$ of $\mathfrak{L}$. Let $a'$ be the image of $\pm a$ under this homomorphism. Similarly let $b'$ be the image of $\pm b$ under the natural homomorphism of $\mathfrak{R}$ onto $\mathfrak{R}'$. Clearly this induces a homomorphism from $\mathfrak{G}$ onto some group $\mathfrak{G}'$ whose elements are of the form $(a', b')$. Thus $\mathfrak{G}'$ is one of Goursat's groups, for its elements are pairs from cyclic or polyhedral groups, and Goursat's list includes all such. We distinguish two cases. (i) The elements $\pm a$ are not in the same coset of $\mathfrak{l}$ in $\mathfrak{L}$, and the elements $\pm b$ are not in the same coset of $\mathfrak{r}$ in $\mathfrak{R}$. The only groups for which this occurs are $2[n]2$, $n$ odd, and $2[4]n$, $n$ odd, that is, the groups for which $\mathfrak{l}$ and $\mathfrak{r}$ are cyclic groups of odd order. (ii) The elements $\pm a$ are in the same coset of $\mathfrak{l}$ in $\mathfrak{L}$ and the elements $\pm b$ are in the same coset of $\mathfrak{r}$ in $\mathfrak{R}$. This is the situation for all other groups in our list.

Now in case (i) the homomorphism described from $\mathfrak{G}$ to $\mathfrak{G}'$ is actually an isomorphism. For the only elements of $\mathfrak{G}$ whose images are $(a', b')$ are $\{(a, b)\}$ and $\{(-a, -b)\}$, since if $\{(a, b)\}$ is in $\mathfrak{G}$ there is no element $\{(a, -b)\}$ in $\mathfrak{G}$. But $\{(a, b)\} = \{(-a, -b)\}$, so the correspondence between the elements of $\mathfrak{G}'$ and those of $\mathfrak{G}$ is 1:1. This determines the order of $\mathfrak{G}'$, as well as $\mathfrak{L}'$ and $\mathfrak{R}'$, so that we can find $\mathfrak{l}'$ and $\mathfrak{r}'$ immediately. In fact, $\mathfrak{l}' = \mathfrak{l}$ and $\mathfrak{r}' = \mathfrak{r}$.










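In case (ii) the homomorphism of $\mathfrak{G}$ to $\mathfrak{G}'$ is 2:1 since the distinct elements $\{(a, \pm b)\}$ of $\mathfrak{G}$ both have $(a', b')$ as image. In particular the image of $(1, \pm b)$ is $(1', b')$, so if $b \in \mathfrak{r}$ then $b' \in \mathfrak{r}'$. That is, $\mathfrak{r}'$ is the image of $\mathfrak{r}$ under the homomorphism taking $\pm b$ to $b'$. This determines the group $\mathfrak{r}'$. Order considerations alone determine the cyclic group $\mathfrak{l}'$ for which $2|\mathfrak{l}'| = |\mathfrak{l}|$.


TABLE I

Group $p_1[t]p_2$    Quaternion transformation corresponding to $S^{-1}$ (§2)    $\mathfrak{L}/\mathfrak{l}$ (§3)    $\mathfrak{R}/\mathfrak{r}$ (§3)

$2[4]n$    $(i)q(-k)$    $\mathfrak{C}_{2n}/\mathfrak{C}_n$    $\langle 2,2,n\rangle/\langle 2,2,n/2\rangle$    ($n$ even)
                         $\mathfrak{C}_{4n}/\mathfrak{C}_n$    $\langle 2,2,n\rangle/\mathfrak{C}_n$    ($n$ odd)
$2[n]2$    $(i)q(-i \exp 2\pi j/n)$    $\mathfrak{C}_4/\mathfrak{C}_2$    $\langle 2,2,n/2\rangle/\mathfrak{C}_n$    ($n$ even)
                         $\mathfrak{C}_4/\mathfrak{E}$    $\langle 2,2,n\rangle/\mathfrak{C}_n$    ($n$ odd)
$3[6]2$    (3 in v3), : = R+jy? = nV5)    $\mathfrak{C}_{12}/\mathfrak{C}_4$    $\langle 2,3,3\rangle/\langle 2,2,2\rangle$
$3[4]3$    (3 + ¥3),(4 - E+ yw . nv?)    $\mathfrak{C}_6/\mathfrak{C}_6$    $\langle 2,3,3\rangle/\langle 2,3,3\rangle$
$3[3]3$    ( + rV3),(1 . i$ . yu - nv?)    $\mathfrak{C}_6/\mathfrak{C}_2$    $\langle 2,3,3\rangle/\langle 2,2,2\rangle$
$3[8]2$    (14 v3), (3 - pv? +i xV3)    $\mathfrak{C}_{12}/\mathfrak{C}_6$    $\langle 2,3,4\rangle/\langle 2,3,3\rangle$
$4[6]2$    (2 - v2), (2 -t +e - xv?)    $\mathfrak{C}_8/\mathfrak{C}_8$    $\langle 2,3,4\rangle/\langle 2,3,4\rangle$
$4[3]4$    (2 + v2), (2 +2)    $\mathfrak{C}_8/\mathfrak{C}_4$    $\langle 2,3,4\rangle/\langle 2,3,3\rangle$
$3[10]2$    (2 ‘ :v3 (23 -+j-- x3)    $\mathfrak{C}_{12}/\mathfrak{C}_{12}$    $\langle 2,3,5\rangle/\langle 2,3,5\rangle$
$5[6]2$    (z + 2) of = is + i; _ ne V5)    $\mathfrak{C}_{20}/\mathfrak{C}_{20}$    $\langle 2,3,5\rangle/\langle 2,3,5\rangle$
$5[4]3$    (: + 2)a( 5 = ~ +3 +k? v15)    $\mathfrak{C}_{30}/\mathfrak{C}_{30}$    $\langle 2,3,5\rangle/\langle 2,3,5\rangle$
$3[5]3$    (244 is (2 -iv®? +3 - x!)    $\mathfrak{C}_6/\mathfrak{C}_6$    $\langle 2,3,5\rangle/\langle 2,3,5\rangle$
$5[3]5$    (5 + 12)a(5 - NE 4 it ah area eee eer    $\mathfrak{C}_{10}/\mathfrak{C}_{10}$    $\langle 2,3,5\rangle/\langle 2,3,5\rangle$

* $\tau = 2\cos \pi/5 = (1 + \sqrt{5})/2$.  † $\sigma = 2\sin \pi/5 = (3 - \tau)^{1/2}$.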

The groups $\mathfrak{L}$, $\mathfrak{R}$, $\mathfrak{l}$, $\mathfrak{r}$ appear in Table I. These tabulations then readily yield the Goursat groups in the form 4.1. Except for the cases $2[4]n$ and $2[n]2$ the results are summarized in the Theorem, for which the following notation is convenient:

Let $p_1[t]p_2$ be the group generated by reflections $S^{-1}$ and $T$, having the defining relations

$(S^{-1})^{p_1} = T^{p_2} = E, \qquad S^{-1}TS^{-1}\cdots = TS^{-1}T\cdots$ ($t$ factors on each side).

The centre of this group is the cyclic group $\mathfrak{z}$ generated by $(S^{-1}T)^{t/2}$ if $t$ is even or by $(S^{-1}T)^{t}$ if $t$ is odd. The period of $(S^{-1}T)^{t/2}$ is $2p_1p_2/k$, where $2k = 2p_1p_2 + p_1t + p_2t - p_1p_2t$ (4, pp. 76, 77, 79). The quotient group $p_1[t]p_2/\mathfrak{z}$ is a polyhedral group $(2, \mu, \nu)$ (7, p. 84).

THEOREM. The group $p_1[t]p_2$ ($p_1 \neq 2$) with centre $\mathfrak{z}$ is 2:1 homomorphic to the group of motions in elliptic 3-space defined by the isomorphism

$\mathfrak{L}'/\mathfrak{l}' \cong \mathfrak{R}'/\mathfrak{r}',$

where
(a) $\mathfrak{L}'$ and $\mathfrak{l}'$ are cyclic groups.
(b) $|\mathfrak{L}'| = \text{l.c.m.}\{p_1, p_2\}$.
(c) $2|\mathfrak{l}'| = |\mathfrak{z}|$.
(d) $\mathfrak{R}' = p_1[t]p_2/\mathfrak{z}$.
(e) $\mathfrak{r}'$ is the unique normal subgroup of $\mathfrak{R}'$ such that $|\mathfrak{L}'||\mathfrak{r}'| = |\mathfrak{R}'||\mathfrak{l}'|$.
((a), (b), and (d) also hold in case $p_1 = 2$.)
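
Parts (b) and (c) of the Theorem reduce to integer arithmetic once the period formula is granted. The sketch below is an illustration only, with hypothetical function names. It assumes that for odd $t$ the period of the generator $(S^{-1}T)^{t}$ is half the formal value $2p_1p_2/k$, which is one reading of the statement above and is labelled as an assumption in the code.

```python
from fractions import Fraction
from math import gcd

def centre_order(p1, t, p2):
    """Order of the centre of p1[t]p2 from the period formula 2*p1*p2/k."""
    two_k = 2 * p1 * p2 + p1 * t + p2 * t - p1 * p2 * t
    period = Fraction(4 * p1 * p2, two_k)  # equals 2*p1*p2/k
    # Assumption: for odd t the centre is generated by (S^-1 T)^t,
    # the square of the formal element (S^-1 T)^(t/2), so its period halves.
    order = period if t % 2 == 0 else period / 2
    if order.denominator != 1:
        raise ValueError("parameters do not give an integral centre order")
    return int(order)

def cyclic_orders(p1, t, p2):
    """Return (|L'|, |l'|) as given by parts (b) and (c) of the Theorem."""
    big = p1 * p2 // gcd(p1, p2)              # (b): l.c.m. of p1 and p2
    small = centre_order(p1, t, p2) // 2      # (c): 2|l'| = |z|
    return big, small
```

For example, `cyclic_orders(3, 4, 3)` gives `(3, 3)`, which matches the entry $\mathfrak{C}_6/\mathfrak{C}_6$ for $3[4]3$ in Table I after the 2:1 identification of $\pm a$.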
REFERENCES 


1. H. S. M. Coxeter, The binary polyhedral groups and other generalizations of the quaternion group, Duke Math. J., 7 (1940), 367-379.

2. H. S. M. Coxeter, Quaternions and reflections, Amer. Math. Monthly, 53 (1946), 136-146.

3. H. S. M. Coxeter, Regular polytopes (London, 1948).

4. H. S. M. Coxeter and W. O. J. Moser, Generators and relations for discrete groups (Berlin, 1957).

5. E. Goursat, Sur les substitutions orthogonales et les divisions régulières de l'espace, Ann. Sci. École Norm. Sup. (3), 6 (1889), 9-102.

6. H. Seifert and W. Threlfall, Topologische Untersuchungen der Diskontinuitätsbereiche endlicher Bewegungsgruppen des dreidimensionalen sphärischen Raumes, Math. Ann., 104 (1931), 1-70.

7. G. C. Shephard, Regular complex polygons, Proc. London Math. Soc. (3), 2 (1952), 82-97.

8. G. C. Shephard and J. A. Todd, Finite unitary reflection groups, Can. J. Math., 6 (1954), 274-304.




University College 
Ibadan, Nigeria. 




























DECOMPOSITION OF FINITE GRAPHS INTO OPEN 
CHAINS 


C. St. J. A. NASH-WILLIAMS 


1. Introduction. If $m$, $n$ are integers, "$m \equiv n$" will mean "$m \equiv n \pmod{2}$." The cardinal number of a set $A$ will be denoted by $|A|$. The set whose elements are $a_1, a_2, \ldots, a_n$ will be denoted by $\{a_1, a_2, \ldots, a_n\}$. The empty set will be denoted by $\emptyset$. If $A$, $B$, $C$ are sets, $A - B$ will denote the set of those elements of $A$ which do not belong to $B$, and $A - B - C$ will denote $(A - B) - C$. The expression $\sum_{\xi \in A} f(\xi)$ will be denoted by $f.A$. The statements "$f = g$ on $A$," "$f \equiv g$ on $A$" will mean that $f(\xi) = g(\xi)$ or $f(\xi) \equiv g(\xi)$ respectively for every $\xi \in A$.

An unoriented graph $U$ consists, for the purposes of this paper, of two disjoint
finite sets $V(U)$, $E(U)$, together with a relationship whereby with each
$\lambda \in E(U)$ is associated an unordered pair of (not necessarily distinct) elements
of $V(U)$ which $\lambda$ is said to join. An oriented graph is a triple $N = (U, t, h)$,
where $U$ is an unoriented graph and $t$, $h$ are mappings of $E(U)$ into $V(U)$ such
that each $\lambda \in E(U)$ joins $\lambda t$ to $\lambda h$. We write $V(U) = V(N)$, $E(U) = E(N)$
and call $\lambda t$, $\lambda h$ the tail and head of $\lambda$ respectively. Either an unoriented or an
oriented graph may be referred to as a graph. Throughout this paper, $U$ will
denote an unoriented graph, $N$ will denote an oriented graph, and $G$ may
denote either. The elements of $V(G)$ and $E(G)$ are called vertices and edges of
$G$ respectively. A subgraph of $U$ is an unoriented graph $H$ such that
$V(H) \subset V(U)$, $E(H) \subset E(U)$ and each edge of $H$ joins the same vertices in $H$ as in
$U$. A subgraph of $N = (U, t, h)$ is an oriented graph $(U_1, t_1, h_1)$ such that $U_1$
is a subgraph of $U$ and $t_1$, $h_1$ are the restrictions of $t$, $h$ respectively to $E(U_1)$.
An orientation of $U$ is an oriented graph of the form $(U, t, h)$. A vertex $\xi$ and
edge $\lambda$ of $G$ are incident if $\xi$ is one or both of the vertices joined by $\lambda$. The
order, ord $G$, of $G$ is $|V(G) \cup E(G)|$. $G$ is empty if $V(G) = E(G) = \emptyset$. The
degree $d(\xi)$ of a vertex $\xi$ of a graph is $2a(\xi) + b(\xi)$, where $a(\xi)$ is the number
of edges joining $\xi$ to itself and $b(\xi)$ is the number joining $\xi$ to other vertices. A
vertex is even or odd according as its degree is even or odd respectively. $G$ is
Eulerian if its vertices are all even. A collection of subgraphs of $G$ are disjoint
(edge-disjoint) if no two of them have a vertex (edge) in common. The union
of the subgraphs $H_1, H_2, \ldots, H_n$ of $G$ is the subgraph $H$ of $G$ such that

$$V(H) = \bigcup_{i=1}^{n} V(H_i), \qquad E(H) = \bigcup_{i=1}^{n} E(H_i).$$


A decomposition of G is a set of edge-disjoint subgraphs of G whose union is G. 
G is connected if it is not the union of two disjoint non-empty subgraphs. The 


Received December 7, 1959. 








158 C. ST. J. A. NASH-WILLIAMS 


components of a non-empty graph are its maximal connected subgraphs. (An 
empty graph is deemed to have 0 components.) A chain-sequence of G is a 
finite sequence 


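$$\xi_0, \lambda_1, \xi_1, \lambda_2, \xi_2, \lambda_3, \ldots, \lambda_n, \xi_n \qquad (n \geq 0)$$

in which the $\xi_i$ are vertices of $G$, the $\lambda_i$ are distinct edges of $G$ and $\lambda_i$ joins
$\xi_{i-1}$ to $\xi_i$ for $i = 1, 2, \ldots, n$. If $G$ is an oriented graph, this chain-sequence is
forwards-directed if

$$\lambda_i t = \xi_{i-1}, \quad \lambda_i h = \xi_i \qquad (i = 1, 2, \ldots, n)$$

and backwards-directed if

$$\lambda_i h = \xi_{i-1}, \quad \lambda_i t = \xi_i \qquad (i = 1, 2, \ldots, n).$$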


A finite sequence is closed or open according as its first and last terms are the 
same or different respectively. If s is a chain-sequence of G, the subgraph of 
G formed by those vertices which appear at least once and those edges which 
appear exactly once in s will be said to be derived from s. A subgraph of G is an 
open chain of $G$ if it is derivable from an open chain-sequence of $G$. If $\xi$, $\eta$
are the first and last terms of an open chain-sequence $s$ of $G$ and $C$ is the open
chain derived from $s$, then clearly $\xi$, $\eta$ are odd in $C$ and every other vertex of $C$
is even in $C$. It follows that an open chain has precisely two odd vertices which
are the end-terms of every chain-sequence from which it is derivable; these
are called the end-vertices of the open chain. If $S$, $T$ are subsets of $V(G)$, $\bar S$ will
denote $V(G) - S$, $S \circ T$ will denote the set of those edges of $G$ which join
elements of $S$ to elements of $T$, and $S\delta$ will denote $S \circ \bar S$. A subgraph of $G$ is
an $ST$-chain if it is derivable from a chain-sequence of $G$ whose first and last
terms belong to $S$, $T$ respectively. A cincture of $G$ is a subset of $E(G)$ which
is of the form $S\delta$ for some subset $S$ of $V(G)$. If $\xi \in V(N)$, an edge $\lambda$ is an
exit of $\xi$ if $\lambda t = \xi$ and an entry of $\xi$ if $\lambda h = \xi$. The number of exits [entries] of
$\xi$ will be denoted by $x(\xi)$ [$e(\xi)$]. The flux out of $\xi$, denoted by $f(\xi)$, is
$x(\xi) - e(\xi)$. $N$ is quasi-symmetrical if $x = e$ on $V(N)$. A route-sequence of $N$
is a chain-sequence of $N$ which is either forwards- or backwards-directed. A
subgraph of $N$ is a route (closed route, open route) of $N$ if it is derivable from
a route-sequence (closed route-sequence, open route-sequence) of $N$.
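The definitions of exits, entries and flux can be illustrated by a minimal Python sketch. The representation (a dictionary from hypothetical edge names to (tail, head) pairs) and the function names are assumptions made for illustration only; they are not part of the paper.

```python
from collections import Counter

def flux(vertices, edges):
    """Return f(v) = x(v) - e(v) for each vertex v.

    `edges` maps an edge name to its (tail, head) pair; a loop has tail == head.
    """
    exits = Counter(tail for tail, _ in edges.values())
    entries = Counter(head for _, head in edges.values())
    return {v: exits[v] - entries[v] for v in vertices}

def is_quasi_symmetrical(vertices, edges):
    """True when x = e at every vertex."""
    return all(value == 0 for value in flux(vertices, edges).values())

# Hypothetical examples: an open route a -> b -> c and a closed route a -> b -> c -> a.
vertices = ["a", "b", "c"]
open_route = {"l1": ("a", "b"), "l2": ("b", "c")}
closed_route = {"l1": ("a", "b"), "l2": ("b", "c"), "l3": ("c", "a")}

print(flux(vertices, open_route))
print(is_quasi_symmetrical(vertices, closed_route))
```

In the open route the flux is 1 at the first vertex, -1 at the last and 0 elsewhere, while the closed route is quasi-symmetrical.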

When, to avoid ambiguity, it is necessary to specify the graph relative to 
which a graph-theoretical symbol is defined, the letter denoting the graph 
will be attached to the symbol in some convenient way. For example, if $\xi$
is a common vertex of two oriented graphs $M$ and $N$, $d_M(\xi)$ will denote the
degree of $\xi$ in $M$. We shall, however, make the convention that, in any context
in which an oriented graph denoted by the letter $N$ is under consideration, all
graph-theoretical symbols relate to $N$ unless the contrary is indicated; for
example, $d(\xi)$ would mean $d_N(\xi)$ in the situation instanced above.

Let $s$ be a forwards-directed route-sequence of $N$, $R$ be the route derived
from $s$ and $\xi$, $\eta$ be the first and last terms of $s$ respectively. Then clearly $R$ is






















DECOMPOSITION OF FINITE GRAPHS 159 


quasi-symmetrical if $\xi = \eta$, and $f_R(\xi) = 1$, $f_R(\eta) = -1$ and $f_R = 0$ on
$V(R) - \{\xi, \eta\}$ if $\xi \neq \eta$. It follows that a closed route cannot also be an open route and
that an open route $R$ has uniquely determined vertices $\xi$, $\eta$ such that $f_R(\xi) = 1$,
$f_R(\eta) = -1$ and $\xi$, $\eta$ are the first and last terms respectively of every
forwards-directed route-sequence from which $R$ is derivable; we call $\xi$, $\eta$ the tail and
head respectively of $R$.

By a G-function, we shall mean a non-negative integer-valued function 
defined on the vertices of $G$. A $G$-function $g$ is congruential if $g \equiv d$ on $V(G)$.
If $g$ is a $G$-function and $\xi \in S \subset V(G)$, $F_g(\xi; S)$ will denote

$$-g(\xi) + g.(S - \{\xi\}) + |S\delta|.$$

We shall call $g$ tolerable if $F_g(\xi; S) \geq 0$ for every pair $\xi$, $S$ such that
$\xi \in S \subset V(G)$. A subset $S$ of $V(G)$ is $g$-critical if $F_g(\xi; S) = 0$ for some $\xi \in S$. A cincture
$C$ of $G$ is $g$-critical if $C = S\delta$ for some $g$-critical subset $S$ of $V(G)$. A
$g$-chain-factor of $G$ is a set $\Phi$ of edge-disjoint open chains of $G$ such that each vertex $\xi$
of $G$ is an end-vertex of exactly $g(\xi)$ elements of $\Phi$. A $g$-decomposition of $G$
is a $g$-chain-factor of $G$ which is a decomposition of $G$.
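The quantity $F_g(\xi; S)$ and the tolerability condition can be checked by brute force on small graphs. The following Python sketch assumes an unoriented graph given as a list of unordered vertex pairs; the function names and the example are illustrative assumptions, and the enumeration over all subsets is exponential, so it is only meant for tiny examples.

```python
from itertools import combinations

def boundary_size(edges, S):
    """|S delta|: the number of edges with exactly one end in S (loops never count)."""
    S = set(S)
    return sum(1 for a, b in edges if (a in S) != (b in S))

def F(g, edges, xi, S):
    """F_g(xi; S) = -g(xi) + g.(S - {xi}) + |S delta|, for xi in S."""
    return -g[xi] + sum(g[v] for v in S if v != xi) + boundary_size(edges, S)

def is_tolerable(g, vertices, edges):
    """True when F_g(xi; S) >= 0 for every non-empty S and every xi in S."""
    for size in range(1, len(vertices) + 1):
        for S in combinations(vertices, size):
            if any(F(g, edges, xi, S) < 0 for xi in S):
                return False
    return True

# Hypothetical example: the path a - b - c.
vertices = ["a", "b", "c"]
edges = [("a", "b"), ("b", "c")]
g_ok = {"a": 1, "b": 0, "c": 1}    # the whole path is one open chain
g_bad = {"a": 3, "b": 0, "c": 1}   # congruential, but a has only one edge

print(is_tolerable(g_ok, vertices, edges), is_tolerable(g_bad, vertices, edges))
```

For `g_bad` the set $S = \{a\}$ gives $F_g(a; S) = -3 + 0 + 1 < 0$, so a congruential function need not be tolerable.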

Let $u$, $v$ be $N$-functions. Then a $(u, v)$-route-factor of $N$ is a set $\Phi$ of
edge-disjoint open routes of $N$ such that each vertex $\xi$ of $N$ is the tail of exactly
$u(\xi)$ and head of exactly $v(\xi)$ elements of $\Phi$. A $(u, v)$-decomposition of $N$ is a
$(u, v)$-route-factor of $N$ which is a decomposition of $N$.

The object of this paper is to prove the following two parallel results: 


THEOREM 1. Let $g$ be a $U$-function. Then $U$ has a $g$-decomposition if and only
if $g$ is tolerable and congruential and $g.V(H) > 0$ for each component $H$ of $U$.


THEOREM 2. Let $u$, $v$ be $N$-functions. Then $N$ has a $(u, v)$-decomposition if and
only if $u + v$ is tolerable, $u - v = f$ on $V(N)$ and $(u + v).V(H) > 0$ for each
component $H$ of $N$.


Our procedure will be to prove Theorem 2 and deduce Theorem 1 from it. 
Certain generalizations of the theorems will be mentioned at the end of the 
paper. 
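The three conditions of Theorem 1 can be tested mechanically on small examples. The sketch below is a brute-force check under assumed conventions (edges as unordered pairs, hypothetical function names); it only tests the conditions and does not construct a decomposition.

```python
from itertools import combinations

def degrees(vertices, edges):
    d = {v: 0 for v in vertices}
    for a, b in edges:
        d[a] += 1
        d[b] += 1  # a loop (a == b) contributes 2 in total
    return d

def components(vertices, edges):
    """Vertex sets of the components, found with a simple union-find."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in edges:
        parent[find(a)] = find(b)
    groups = {}
    for v in vertices:
        groups.setdefault(find(v), []).append(v)
    return list(groups.values())

def is_tolerable(g, vertices, edges):
    for size in range(1, len(vertices) + 1):
        for S in map(set, combinations(vertices, size)):
            cut = sum(1 for a, b in edges if (a in S) != (b in S))
            total = sum(g[v] for v in S)
            # F_g(xi; S) = -g(xi) + g.(S - {xi}) + |S delta| = total - 2 g(xi) + cut
            if any(total - 2 * g[xi] + cut < 0 for xi in S):
                return False
    return True

def satisfies_theorem_1(g, vertices, edges):
    d = degrees(vertices, edges)
    congruential = all((g[v] - d[v]) % 2 == 0 for v in vertices)
    positive = all(sum(g[v] for v in comp) > 0
                   for comp in components(vertices, edges))
    return congruential and positive and is_tolerable(g, vertices, edges)

# Hypothetical examples: a star with centre c, and a triangle.
star_v = ["c", "p", "q", "r"]
star_e = [("c", "p"), ("c", "q"), ("c", "r")]
tri_v = ["a", "b", "c"]
tri_e = [("a", "b"), ("b", "c"), ("c", "a")]

print(satisfies_theorem_1({"c": 1, "p": 1, "q": 1, "r": 1}, star_v, star_e))
print(satisfies_theorem_1({"a": 0, "b": 0, "c": 0}, tri_v, tri_e))
```

The triangle with $g = 0$ is tolerable and congruential but fails the condition $g.V(H) > 0$, which matches the fact that a closed circuit cannot be split into open chains without end-vertices.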

2. Proof of Theorem 2. 

LEMMA 1. If $G$ has a $g$-chain-factor, $g$ is tolerable.


Proof. Let $\Phi$ be a $g$-chain-factor of $G$. For any pair of disjoint subsets $S$, $T$
of $V(G)$, let $S * T$ denote the number of $ST$-chains in $\Phi$. Then, if $\xi \in S \subset V(G)$,

$$g(\xi) = (\{\xi\} * \bar S) + \sum_{\eta \in S - \{\xi\}} (\{\xi\} * \{\eta\}).$$

But $\{\xi\} * \{\eta\} \leq g(\eta)$ for every $\eta \in S - \{\xi\}$; and $\{\xi\} * \bar S \leq |S\delta|$ since $\xi \in S$ and
so each $\{\xi\}\bar S$-chain must include an element of $S\delta$. Hence
$g(\xi) \leq g.(S - \{\xi\}) + |S\delta|$; and the lemma is proved.


LEMMA 2. If $A$, $B$ are disjoint subsets of $V(G)$, $|(A \cup B)\delta| + |A\delta| \geq |B\delta|$.


Proof. If $V(G) - (A \cup B) = C$, the above inequality follows from the
relations

$$|A\delta| = |C \circ A| + |A \circ B|, \quad |B\delta| = |B \circ C| + |A \circ B|, \quad |(A \cup B)\delta| = |C \circ A| + |B \circ C|.$$


LEMMA 3. If $S \subset V(G)$, $|S\delta| \equiv d.S$.


Proof. An edge contributes 2, 1, or 0 to $d.S$ according as it belongs to
$S \circ S$, $S\delta$ or $\bar S \circ \bar S$ respectively.


COROLLARY 3A. If $g$ is a congruential $G$-function and $\xi \in S \subset V(G)$,
$F_g(\xi; S)$ is even.


COROLLARY 3B. (= (1, chapter 11, Theorem 3)). The number of odd vertices
of a graph is even.


Proof. Take S = V(G) in Lemma 3. 
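Lemma 3 and Corollary 3B are easy to confirm numerically on a small multigraph with a loop and a repeated edge. The Python sketch below is only such a spot check, with assumed names and a hypothetical example graph; it is not a proof.

```python
from itertools import combinations

def degree_sum(edges, S):
    """d.S: each edge contributes one for every end it has in S (a loop contributes 2)."""
    return sum((a in S) + (b in S) for a, b in edges)

def cut_size(edges, S):
    """|S delta|: the number of edges with exactly one end in S."""
    return sum(1 for a, b in edges if (a in S) != (b in S))

def lemma_3_holds(vertices, edges):
    """Check |S delta| == d.S (mod 2) for every subset S of the vertices."""
    return all(
        (cut_size(edges, set(S)) - degree_sum(edges, set(S))) % 2 == 0
        for size in range(len(vertices) + 1)
        for S in combinations(vertices, size)
    )

# Hypothetical multigraph: a doubled edge 1-2 and a loop at 3.
vertices = [1, 2, 3, 4]
edges = [(1, 2), (1, 2), (2, 3), (3, 3), (3, 4), (4, 1)]

odd_vertices = [v for v in vertices if degree_sum(edges, {v}) % 2 == 1]
print(lemma_3_holds(vertices, edges), len(odd_vertices))
```

Taking $S = V(G)$ makes the cut empty, which is why the number of odd vertices printed at the end is even.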


Definition. Let $\lambda$, $\mu$ be distinct edges of $N$ such that $\lambda h = \mu t = \xi$. Then
the oriented graph $M$ obtained from $N$ by fusion of $\lambda$ and $\mu$ at $\xi$ is defined by
the rules:

(i) $V(M) = V(N)$, $E(M) = [E(N) - \{\lambda, \mu\}] \cup \{\nu\}$, where $\nu$ is a newly
added edge and is not an element of the set $V(N) \cup E(N)$;

(ii) $\nu t_M = \lambda t$, $\nu h_M = \mu h$;

(iii) $\kappa t_M = \kappa t$, $\kappa h_M = \kappa h$ for every $\kappa \in E(N) - \{\lambda, \mu\}$.


LEMMA 4. If, in the circumstances of the above definition, $g$ is a tolerable
congruential $N$-function and no $g$-critical cincture of $N$ includes both $\lambda$ and $\mu$,
then $g$ is tolerable in $M$.


Proof. Let $\xi \in S \subset V(M)$ $(= V(N))$. If $\lambda$, $\mu$ do not both belong to $S\delta$, then
$|S\delta_M| = |S\delta|$ and so ${}_MF_g(\xi; S) = F_g(\xi; S) \geq 0$. If $\lambda$, $\mu$ both belong to $S\delta$, then
(i) $|S\delta_M| = |S\delta| - 2$, whence ${}_MF_g(\xi; S) = F_g(\xi; S) - 2$, and (ii) $S\delta$ must not
be $g$-critical, whence, by the tolerability of $g$ and Corollary 3A, $F_g(\xi; S) \geq 2$.
Hence ${}_MF_g(\xi; S) \geq 0$.
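The fusion operation itself is a small edit of the edge set, as the following Python sketch illustrates. The dictionary representation and the names `fuse` and `flux` are assumptions for illustration; the checks simply mirror the conditions in the definition.

```python
def fuse(edges, lam, mu, nu):
    """Return the edge map of M, obtained by fusing the edges lam and mu.

    `edges` maps edge names to (tail, head) pairs; nu is the name of the new edge.
    """
    if lam == mu:
        raise ValueError("lam and mu must be distinct edges")
    if edges[lam][1] != edges[mu][0]:
        raise ValueError("the head of lam must be the tail of mu")
    if nu in edges:
        raise ValueError("nu must be a newly added edge")
    fused = {name: ends for name, ends in edges.items() if name not in (lam, mu)}
    fused[nu] = (edges[lam][0], edges[mu][1])  # tail of lam, head of mu
    return fused

def flux(vertices, edges):
    f = {v: 0 for v in vertices}
    for tail, head in edges.values():
        f[tail] += 1
        f[head] -= 1
    return f

# Hypothetical example: fuse l: a -> s and m: s -> b at the vertex s.
vertices = ["a", "s", "b"]
N = {"l": ("a", "s"), "m": ("s", "b"), "k": ("b", "a")}
M = fuse(N, "l", "m", "n")
print(M)
```

Fusion removes one exit and one entry at the fusion vertex and leaves every other vertex with the same exits and entries, so the flux is unchanged while the order drops by one.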


Definitions. If $S \subset V(N)$, $S^*$ will denote the subgraph of $N$ defined by
$V(S^*) = S$, $E(S^*) = S \circ S$, and $N_S$ will denote the oriented graph $M$ defined as
follows.

(i) $V(M) = \bar S \cup \{S'\}$, $E(M) = \bar S \circ V(N)$, where $S'$ [$\notin V(N) \cup E(N)$] is a
newly introduced vertex.

(ii) Write $\sigma(\xi) = \xi$ if $\xi \in \bar S$ and $\sigma(\xi) = S'$ if $\xi \in S$. Then $\lambda t_M = \sigma(\lambda t)$,
$\lambda h_M = \sigma(\lambda h)$ for every $\lambda \in E(M)$.

Thus $N_S$ is obtained from $N$ by contracting the subgraph $S^*$ to a single vertex $S'$.


LEMMA 5. Let $g$ be a tolerable $N$-function and $C$ be a $g$-critical subset of $V(N)$.
If $g(C')$, $g(\bar C')$ are both defined to be $|C\delta|$, then $g$ is tolerable in $N_C$ and $N_{\bar C}$.


Proof. Write $N_{\bar C} = H$, $N_C = K$. Since $C$ is critical,

(1) $g(\xi) = g.(C - \{\xi\}) + |C\delta|$

for some $\xi \in C$. Since $g(C') = g(\bar C') = |C\delta|$, (1) can be rewritten in each of the
forms

(1') $g(\xi) = g.[V(H) - \{\xi\}]$,
(1'') $g(C') = g(\xi) - g.(C - \{\xi\})$.

LEMMA 5A$^1$. If $S \subset V(H) - \{\xi\}$, $g.S \leq |S\delta_H|$.

Proof. Since $F_g(\xi; C - S) \geq 0$,

(2) $g(\xi) - g.(C - S - \{\xi\}) \leq |(C - S)\delta|$.

If $\bar C' \notin S$,

$$|S\delta_H| = |S\delta| \geq |(C - S)\delta| - |C\delta| \geq g(\xi) - g.(C - S - \{\xi\}) - |C\delta| = g.S$$

by Lemma 2, (2) and (1). If $\bar C' \in S$,

$$|S\delta_H| = |(C - S)\delta| \geq g(\xi) - g.(C - S - \{\xi\}) = g.S$$

by (2) and (1').


Suppose that $Y \subset V(H)$. Let $V(H) - Y = W$. If $\xi \notin Y$, then, for every
$\eta \in Y$,

$${}_HF_g(\eta; Y) \geq |Y\delta_H| - g(\eta) \geq |Y\delta_H| - g.Y \geq 0$$

by Lemma 5A. If $\xi \in Y$, then by (1'),

$${}_HF_g(\xi; Y) = |Y\delta_H| - g.W = |W\delta_H| - g.W \geq 0$$

by Lemma 5A, and, for every $\eta \in Y - \{\xi\}$,

$${}_HF_g(\eta; Y) \geq g.(Y - \{\eta\}) - g(\eta) \geq 0$$

by (1'). Hence $g$ is tolerable in $H$.

Suppose that $Z \subset V(K)$. If $C' \notin Z$, then $Z\delta_K = Z\delta$ and so
${}_KF_g(\eta; Z) = F_g(\eta; Z) \geq 0$ for every $\eta \in Z$. If $C' \in Z$, then

(3) $Z\delta_K = \tilde Z\delta$,

where $\tilde Z = (Z - \{C'\}) \cup C$. By (1'') and (3), ${}_KF_g(C'; Z) = F_g(\xi; \tilde Z) \geq 0$;
and, by (3) and Lemma 2,

$$g(C') + |Z\delta_K| = |C\delta| + |\tilde Z\delta| \geq |(Z - \{C'\})\delta|,$$

whence ${}_KF_g(\eta; Z) \geq F_g(\eta; Z - \{C'\}) \geq 0$ for every $\eta \in Z - \{C'\}$. Hence $g$ is
tolerable in $K$.


Definitions. An edge $\lambda$ of $N$ is a loop if $\lambda t = \lambda h$. If $g$ is an $N$-function, a vertex
$\xi$ is $g$-critical if the set $\{\xi\}$ is $g$-critical, that is, if $g(\xi) = |\{\xi\}\delta|$, and is $g$-safe if
$F_g(\xi; \{\xi\}) > 0$, that is, if $g(\xi) < |\{\xi\}\delta|$. A one-edge-route is a route which has
exactly one edge. If $S \subset V(N)$, an edge $\lambda$ is an exit of $S$ if $\lambda t \in S$, $\lambda h \in \bar S$,
and is an entry of $S$ if $\lambda h \in S$, $\lambda t \in \bar S$. If $A \subset E(N)$, $N - A$ will denote the
subgraph of $N$ defined by the relations $V(N - A) = V(N)$, $E(N - A) = E(N) - A$.


$^1$We give the names Lemma $n$A, Lemma $n$B to lemmas which themselves form part of the
proof of Lemma $n$.


LEMMA 6. If $u$ and $v$ are $N$-functions such that $u - v = f$ on $V(N)$ and
$u + v$ is tolerable, then $N$ has a $(u, v)$-route-factor.


Proof. Since Lemma 6 is trivially true for an oriented graph of order 0,
it may be proved by induction on ord $N$. We shall therefore make the inductive
hypothesis that Lemma 6 is true for all oriented graphs of lower order than $N$.
Let $u + v = g$. If $N$ has a loop $\lambda$, then $\lambda$ belongs to no cincture. Therefore $g$,
being tolerable in $N$, is tolerable in $N - \{\lambda\}$. It is also clear that
$f_{N - \{\lambda\}} = f = u - v$ on $V(N)$. Therefore, by the inductive hypothesis, $N - \{\lambda\}$ has a
$(u, v)$-route-factor, and hence so has $N$. We shall therefore henceforward assume
that $N$ is loopless. We shall consider separately the following two cases: (I)
$V(N)$ has a $g$-critical subset $C$ such that $|C| \geq 2$ and $|\bar C| \geq 2$; (II) $V(N)$ has
no such subset.

Proof for Case I. Let the exits of $C$ be $\lambda_1, \lambda_2, \ldots, \lambda_p$ and its entries be
$\lambda_{p+1}, \lambda_{p+2}, \ldots, \lambda_r$. If we write $N_C = K$, $u(C') = p$, $v(C') = r - p$ and
$g(C') = |C\delta|$, then $u$, $v$, and $g$ are defined on all vertices of $K$ and $g = u + v$
on $V(K)$. By Lemma 5, $g$ is tolerable in $K$. It is clear that $u(C') - v(C') = f_K(C')$
and that $f_K = f = u - v$ on $\bar C$; hence $u - v = f_K$ on $V(K)$. Since
$|C| \geq 2$, ord $K$ < ord $N$. Therefore, by the inductive hypothesis, $K$ has a
$(u, v)$-route-factor $\Phi$. Since $u(C') + v(C') = r$ and $\lambda_1, \lambda_2, \ldots, \lambda_r$ are the only
edges incident with $C'$ in $K$, it is clear that $\lambda_1, \lambda_2, \ldots, \lambda_r$ must be distributed
in a one-to-one fashion amongst the $r$ elements of $\Phi$ which have $C'$ as an
end-vertex; let $R_i$ be that element of $\Phi$ which includes $\lambda_i$ among its edges. Then
clearly $R_i$ is derivable from a route-sequence of the form $C', \lambda_i, s_i$, where $s_i$ is
a route-sequence of $\bar C^*$. Clearly $C', \lambda_i, s_i$ and hence also $s_i$ must be forwards-
or backwards-directed according as $C'$ is the tail or head respectively of $\lambda_i$
in $K$, that is, according as $i \leq p$ or $i > p$ respectively. Moreover, if
$\Phi - \{R_1, R_2, \ldots, R_r\} = \Delta$, then, since the $\lambda_i$ are the only edges incident with
$C'$ in $K$ and $\lambda_i \in E(R_i)$ $(i = 1, 2, \ldots, r)$, it follows that each element of $\Delta$
is a route of $\bar C^*$.

If we write $u(\bar C') = r - p$, $v(\bar C') = p$, an argument similar to that of the
preceding paragraph, but using the hypothesis that $|\bar C| \geq 2$ and the assertion
concerning $N_{\bar C}$ in Lemma 5, shows that $N_{\bar C}$ has a $(u, v)$-route-factor
$\bar\Delta \cup \{\bar R_1, \bar R_2, \ldots, \bar R_r\}$ such that the elements of $\bar\Delta$ are routes of $C^*$ and, for
$i = 1, 2, \ldots, r$, $\bar R_i$ is derivable from a route-sequence of the form $\bar s_i, \lambda_i, \bar C'$, where $\bar s_i$ is a
route-sequence of $C^*$ and is forwards- or backwards-directed according as
$i \leq p$ or $i > p$ respectively. It is now not difficult to see that, if $S_i$ is the
route derived from the route-sequence $\bar s_i, \lambda_i, s_i$, then $\Delta \cup \bar\Delta \cup \{S_1, S_2, \ldots, S_r\}$
is a $(u, v)$-route-factor of $N$.


Proof for Case II.


LEMMA 6A. A vertex $\xi$ of $N$ is $g$-critical if $V(N) - \{\xi\}$ is $g$-critical.

























Proof. If $V(N) - \{\xi\}$ is $g$-critical,

$$-g(\eta) + g.(V(N) - \{\xi, \eta\}) + |\{\xi\}\delta| = 0$$

for some $\eta \in V(N) - \{\xi\}$. But

$$-g(\eta) + g.(V(N) - \{\xi, \eta\}) + g(\xi) = F_g(\eta; V(N)) \geq 0.$$

Therefore $g(\xi) \geq |\{\xi\}\delta|$, that is, $F_g(\xi; \{\xi\}) \leq 0$. Hence, since $g$ is tolerable,
$F_g(\xi; \{\xi\}) = 0$ and so $\xi$ is $g$-critical.


COROLLARY 6AA. In Case II, every non-empty $g$-critical cincture is of the
form $\{\xi\}\delta$ for some $g$-critical vertex $\xi$.


If $\xi$ is a $g$-critical vertex, $g(\xi) = |\{\xi\}\delta|$, that is, since $N$ is loopless,
$u(\xi) + v(\xi) = x(\xi) + e(\xi)$. But, by hypothesis, $u(\xi) - v(\xi) = f(\xi) = x(\xi) - e(\xi)$.
Hence $u(\xi) = x(\xi)$ and $v(\xi) = e(\xi)$. Hence, since $N$ is loopless, the one-edge-routes
in $N$ constitute a $(u, v)$-route-factor of $N$ if every vertex of $N$ is $g$-critical.
We may therefore assume that $N$ has a $g$-safe vertex $\sigma$. Since $\sigma$ is $g$-safe,

$$|\{\sigma\}\delta| > g(\sigma) \geq |u(\sigma) - v(\sigma)| = |f(\sigma)|$$

by hypothesis. Therefore

(4) $x(\sigma) > 0$, $e(\sigma) > 0$.


LEMMA 6B. The vertex $\sigma$ has an entry $\lambda$ and an exit $\mu$ such that no $g$-critical
cincture includes both $\lambda$ and $\mu$.


Proof. (Throughout this proof, the reader should bear in mind that $N$ is
assumed to be loopless.) If $\sigma$ is adjacent to two or more other vertices, it is
easily seen from (4) that $\sigma$ has an entry $\lambda$ and an exit $\mu$ which join it to
different vertices; since $\sigma$ is $g$-safe and is the only vertex incident with both
$\lambda$ and $\mu$, Corollary 6AA shows that no $g$-critical cincture includes both $\lambda$ and
$\mu$. We may therefore assume that $\sigma$ is adjacent to at most one, and hence, by
(4), to exactly one other vertex; let this vertex be $\tau$. Since $\sigma$ is adjacent only to
$\tau$, $|\{\sigma, \tau\}\delta| = |\{\tau\}\delta| - |\{\sigma\}\delta|$. Therefore

$$-g(\tau) + g(\sigma) + |\{\tau\}\delta| - |\{\sigma\}\delta| = F_g(\tau; \{\sigma, \tau\}) \geq 0.$$

But $|\{\sigma\}\delta| > g(\sigma)$ since $\sigma$ is $g$-safe. Therefore $|\{\tau\}\delta| > g(\tau)$. Hence $\tau$ is also
$g$-safe. But, by (4), we can select an entry $\lambda$ and an exit $\mu$ of $\sigma$. Since $\lambda$, $\mu$ must
both join $\sigma$, $\tau$, which are both $g$-safe, Corollary 6AA again implies the required
result.

Since

$$g = u + v \equiv u - v = f = x - e \equiv x + e = d$$

on $V(N)$, $g$ is congruential in $N$. Therefore, by Lemmas 6B and 4, $g$ is tolerable
in the oriented graph ($M$, say) obtained from $N$ by fusion of $\lambda$ and $\mu$ at $\sigma$. It
is also clear that $f_M = f = u - v$ on $V(N) = V(M)$ and that ord $M$ = ord $N - 1$.
Therefore, by the inductive hypothesis, $M$ has a $(u, v)$-route-factor, and
it is easily seen that this is converted into a $(u, v)$-route-factor of $N$ when we
reverse the fusion of $\lambda$ and $\mu$ at $\sigma$.













LEMMA 7. If $N$ has a decomposition of the form $\Phi \cup \Theta$, where $\Phi$ is a
$(u, v)$-route-factor of $N$ and $\Theta$ is a set of closed routes each of which has a vertex in
common with some element of $\Phi$, then $N$ has a $(u, v)$-decomposition.


Proof. Let $\Phi = \{R_1, R_2, \ldots, R_r\}$, and let $\Theta = \Theta_1 \cup \Theta_2 \cup \ldots \cup \Theta_r$,
where the $\Theta_i$ are disjoint and each element of $\Theta_i$ has a vertex in common with
$R_i$. If $S_i$ is the union of $R_i$ and the elements of $\Theta_i$, it is easily seen that $S_i$
is an open route with the same head and tail as $R_i$. Hence $\{S_1, S_2, \ldots, S_r\}$
is a $(u, v)$-decomposition of $N$.


Proof of Theorem 2. The necessity of the first condition follows from Lemma 
1, and the necessity of the other two is obvious. Conversely, suppose that 
these three conditions are satisfied. Then, by Lemma 6, $N$ has a
$(u, v)$-route-factor $\Phi$. If $T$ is the union of the elements of $\Phi$, then clearly $f_T = u - v$ on
$V(T)$ and $u = v = 0$ on $V(N) - V(T)$. But $f = u - v$ on $V(N)$ by hypothesis.
Therefore $N - E(T)$ is quasi-symmetrical. Therefore, by (1, chapter
11, Theorem 7), every component of $N - E(T)$ is a closed route. Moreover,
since $(u + v).V(H) > 0$ for each component $H$ of $N$, each component of $N$
contains an element of $\Phi$ and hence each component of $N - E(T)$ has a
vertex in common with an element of $\Phi$. Therefore, by Lemma 7 (with $\Theta$
taken to be the set of components of $N - E(T)$), $N$ has a $(u, v)$-decomposition.


3. Proof of Theorem 1. 


LEMMA 8. Every unoriented graph has an orientation in which $f(\xi) = 0$ for
each even vertex $\xi$ and $f(\xi) = \pm 1$ for each odd vertex $\xi$.


Proof. Let $U$ be a given unoriented graph. By Corollary 3B, the number of
odd vertices of $U$ is even; let it be $2r$. Then $U$ can be converted into an Eulerian
unoriented graph $H$ by the addition of $r$ new edges joining its odd vertices in
pairs.$^2$ $H$, being Eulerian, has by (1, p. 30, ll. 4-9), a quasi-symmetrical
orientation, and this clearly induces in $U$ an orientation of the required type.

Proof of Theorem 1. The necessity of the condition that $g$ be tolerable follows
from Lemma 1, and the necessity of the remaining conditions is obvious.
Conversely, let the conditions of Theorem 1 be satisfied, and let $N$ be an
orientation of $U$ satisfying the condition of Lemma 8. Write $u = \frac{1}{2}(g + f)$,
$v = \frac{1}{2}(g - f)$, where $f$ denotes flux in $N$. (Here $u$ and $v$ are $N$-functions:
$g + f$ is even because $g \equiv d \equiv f$, and $g \geq |f|$ because $g$ is odd wherever
$f = \pm 1$.) Then, by Theorem 2, $N$ has a $(u, v)$-decomposition, and hence $U$ has a
$g$-decomposition.

4. Generalizations. 

Definitions. A semi-oriented graph is a quintuple $S = (U, r, e, p, q)$ such
that $U$ is an unoriented graph, $r$, $e$ are disjoint sets and $p$, $q$ are mappings of
$r \cup e$ into $V(U)$, $E(U)$ respectively, subject to the condition that each edge $\lambda$
of $U$ is the image under $q$ of exactly two elements of $r \cup e$ and that, if these
elements are $\varepsilon$, $\varepsilon'$, then $\lambda$ joins $\varepsilon p$ to $\varepsilon' p$ in $U$. Vertices and edges of $U$ are


$^2$This procedure is suggested by the proof of (1, chapter 11, Theorem 4).



















called vertices and edges of $S$ respectively, and elements of $r \cup e$ are called
hinges of $S$. A vertex $\xi$ (edge $\lambda$) of $U$ is incident with a hinge $\varepsilon$ if $\varepsilon p = \xi$ ($\varepsilon q = \lambda$).
Two hinges are opposed if one of them belongs to $r$ and the other to $e$. If
$\xi \in V(U)$, $f(\xi)$ will denote $|\Theta \cap r| - |\Theta \cap e|$, where $\Theta$ is the set of those
hinges of $S$ which are incident with $\xi$. An open route-sequence of $S$ is a finite
sequence

(5) $\xi_0, \varepsilon_1, \lambda_1, \bar\varepsilon_1, \xi_1, \varepsilon_2, \lambda_2, \bar\varepsilon_2, \xi_2, \varepsilon_3, \ldots, \lambda_n, \bar\varepsilon_n, \xi_n$

such that $\xi_0, \lambda_1, \xi_1, \lambda_2, \ldots, \lambda_n, \xi_n$ is an open chain-sequence of $U$, the $\varepsilon_i$ and
$\bar\varepsilon_i$ are hinges of $S$, the relations

$$\varepsilon_i p = \xi_{i-1}, \quad \bar\varepsilon_i p = \xi_i, \quad \varepsilon_i q = \bar\varepsilon_i q = \lambda_i, \quad \varepsilon_i \neq \bar\varepsilon_i$$

hold for $i = 1, 2, \ldots, n$ and $\bar\varepsilon_i$, $\varepsilon_{i+1}$ are opposed for $i = 1, 2, \ldots, n - 1$.
(The last condition is vacuous if $n = 1$.) The vertex $\xi_0$ [$\xi_n$] is a tail or head
of (5) according as $\varepsilon_1$ [$\bar\varepsilon_n$] belongs to $r$ or $e$ respectively. (Thus an open
route-sequence of $S$ may have two tails, two heads, or one tail and one head.) An
open route of $S$ is a subgraph of $S$ derivable from an open route-sequence of
$S$. (We shall leave the reader to guess the definitions of subgraph of $S$, derivable
and certain other terms relating to semi-oriented graphs from corresponding
definitions given for unoriented and oriented graphs.) If $R$ is an open route of
$S$, $\xi$ is a vertex of $R$, and $s$ is any open route-sequence from which $R$ is derivable,
then clearly $f_R(\xi) = 1$ if and only if $\xi$ is a tail of $s$ and $f_R(\xi) = -1$ if and only
if $\xi$ is a head of $s$; we shall therefore call $\xi$ a tail of $R$ if $f_R(\xi) = 1$ and a head
of $R$ if $f_R(\xi) = -1$. A decomposition of $S$ is a set of edge-disjoint subgraphs of
$S$ whose union is $S$. If $u$, $v$ are $U$-functions, a $(u, v)$-decomposition of $S$ is a
decomposition $D$ of $S$ into open routes such that each vertex $\xi$ is a tail of
exactly $u(\xi)$ and head of exactly $v(\xi)$ elements of $D$. Semi-oriented graphs are
virtually a generalization of oriented graphs, since an oriented graph may be
regarded as a semi-oriented graph in which each edge is incident with two
opposed hinges. A semi-orientation of an unoriented graph $U_1$ is a
semi-oriented graph having $U_1$ as its first constituent element.
Theorem 2 admits the following generalization: 


THEOREM 3. Let $S = (U, r, e, p, q)$ be a semi-oriented graph and $u$, $v$ be
$U$-functions. Then $S$ has a $(u, v)$-decomposition if and only if $u + v$ is tolerable,
$u - v = f$ on $V(U)$ and $(u + v).V(H) > 0$ for each component $H$ of $U$.


The proof of Theorem 3 is a fairly easy adaptation of that of Theorem 2; 
but we refrained from giving the argument in this more general form to avoid 
obscurity. It may be remarked, however, that Theorem 1 is more readily 
deducible from Theorem 3 than from Theorem 2, since Lemma 8 becomes 
trivial if, in its statement, ‘‘an orientation”’ be replaced by ‘‘a semi-orientation.”’ 


Definitions. A partition of a set $A$ is a set of disjoint subsets of $A$ whose
union is $A$. If $P$ is a partition of $V(N)$, an $N$-function $g$ is $P$-tolerable if

$$g.(S \cap T) \leq g.(S - T) + |S\delta|$$

for every pair $S$, $T$ of subsets of $V(N)$ such that $T \in P$. A set $\Phi$ of open routes
of $N$ is $P$-restricted if no element of $\Phi$ has both its end-vertices in the same
element of $P$.


THEOREM 2'. Let $P$ be a partition of $V(N)$ and $u$, $v$ be $N$-functions. Then $N$
has a $P$-restricted $(u, v)$-decomposition if and only if $u + v$ is $P$-tolerable,
$u - v = f$ on $V(N)$, and $(u + v).V(H) > 0$ for each component $H$ of $N$.


Theorem 2' is a generalization of Theorem 2, since it clearly reduces to
Theorem 2 when $P$ is taken to be the partition of $V(N)$ into subsets of order
1. The proof of Theorem 2', which we shall not give in detail, consists in
applying Theorem 2 to an oriented graph $N_1$ and $N_1$-functions $u_1$, $v_1$ defined
as follows. $N_1$ is obtained from $N$ by adding, for each $T \in P$, a new vertex
$a_T$ and, for each pair $\xi$, $T$ such that $\xi \in T \in P$, $u(\xi)$ new edges with tail $a_T$
and head $\xi$ and $v(\xi)$ new edges with tail $\xi$ and head $a_T$. (Thus $|P|$ new vertices
and $(u + v).V(N)$ new edges are added altogether.) We write $u_1(a_T) = u.T$,
$v_1(a_T) = v.T$ and $u_1 = v_1 = 0$ on $V(N)$.
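The auxiliary graph $N_1$ described above is straightforward to build explicitly. The Python sketch below does so under assumed conventions: edges are (tail, head) pairs, the partition is a list of lists, and the new vertex $a_T$ is represented by a hypothetical tuple `("a", k)`, where `k` is the position of $T$ in the partition.

```python
def augment(vertices, edges, partition, u, v):
    """Build N_1, u_1 and v_1 from N, a partition P of V(N) and N-functions u, v."""
    vertices1 = list(vertices)
    edges1 = list(edges)
    u1 = {x: 0 for x in vertices}
    v1 = {x: 0 for x in vertices}
    for k, block in enumerate(partition):
        a_T = ("a", k)  # hypothetical name for the new vertex attached to this block
        vertices1.append(a_T)
        u1[a_T] = sum(u[x] for x in block)
        v1[a_T] = sum(v[x] for x in block)
        for x in block:
            edges1 += [(a_T, x)] * u[x]  # u(x) new edges with tail a_T and head x
            edges1 += [(x, a_T)] * v[x]  # v(x) new edges with tail x and head a_T
    return vertices1, edges1, u1, v1

def flux(vertices, edges):
    f = {x: 0 for x in vertices}
    for tail, head in edges:
        f[tail] += 1
        f[head] -= 1
    return f

# Hypothetical example: a single edge a -> b with the partition into singletons.
vertices = ["a", "b"]
edges = [("a", "b")]
u = {"a": 1, "b": 0}
v = {"a": 0, "b": 1}
vertices1, edges1, u1, v1 = augment(vertices, edges, [["a"], ["b"]], u, v)
print(len(vertices1), len(edges1))
```

When $u - v = f$ on $V(N)$, the relation $u_1 - v_1 = f$ also holds in $N_1$, since each old vertex gains $v(\xi)$ exits and $u(\xi)$ entries while $a_T$ has $u.T$ exits and $v.T$ entries.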

Theorems 1 and 3 admit corresponding generalizations to ‘‘P-restricted” 
decompositions. 

Since this work was a part of my thesis, I should like gratefully to acknowl- 
edge the help and guidance of my Research Supervisors, Professor D. Rees, 
Professor N. E. Steenrod, and Dr. S. Wylie, and the following financial sup- 
port; grants from the Department of Scientific and Industrial Research, the 
University of Cambridge and Trinity Hall, an Amy Mary Preston Read 
Scholarship (awarded by the University of Cambridge), a J. S. K. Visiting 
Fellowship (awarded by the University of Princeton), and a Fulbright Travel 
Grant. 


REFERENCE 


1. D. König, Theorie der endlichen und unendlichen Graphen (Leipzig, 1936, and New York,
1950).


University of Aberdeen 



















HOMOTOPY AND ISOTOPY PROPERTIES OF 
TOPOLOGICAL SPACES 


SZE-TSEN HU 


1. Introduction. The most important notion in topology is that of a 
homeomorphism $f: X \to Y$ from a topological space $X$ onto a topological space
$Y$. If a homeomorphism $f: X \to Y$ exists, then the topological spaces $X$ and $Y$
are said to be homeomorphic (or topologically equivalent), in symbols, 


X = Y. 


The relation = among topological spaces is obviously reflexive, symmetric, 
and transitive; hence it is an equivalence relation. For an arbitrary family F 
of topological spaces, this equivalence relation = divides F into disjoint equiva- 
lence classes called the topology types of the family F. Then, the main problem 
in topology is the topological classification problem formulated as follows. 

The topological classification problem: Given a family F of topological spaces, 
find an effective enumeration of the topology types of the family F and exhibit 
a representative space in each of these topology types. 

A number of special cases of this problem were solved long ago. For example, 
the family of Euclidean spaces is classified by their dimensions and the family 
of closed surfaces is classified by means of orientability and Euler characteristic. 
However, the problem is far from being solved; in fact, the topological classi- 
fication of the family of three-dimensional compact manifolds still remains an 
outstanding unsolved problem. 

To overcome the difficulty of the topological classification problem, topo- 
logists introduced weaker equivalence relations, namely, the homotopy and 
isotopy equivalences, which would give rise to larger but fewer classes of spaces 
than the topology types. 

A continuous map $f: X \to Y$ is said to be a homotopy equivalence provided
that there exists a continuous map $g: Y \to X$ such that the compositions
$g \circ f$ and $f \circ g$ are homotopic to the identity maps on $X$ and $Y$ respectively.
Two topological spaces $X$ and $Y$ are said to be homotopically equivalent (in
symbol, $X \simeq Y$) if there exists a homotopy equivalence $f: X \to Y$.

It is easily verified that the relation $\simeq$ among topological spaces is reflexive,
symmetric, and transitive; hence it is an equivalence relation. For any given
family $F$ of topological spaces, this equivalence relation $\simeq$ divides $F$ into
disjoint equivalence classes called the homotopy types of the family. Analogous to


Received November 16, 1959. This research was supported by the United States Air Force 
through the Air Force Office of Scientific Research and Development Command, under 
Contract No. AF 49(638)-179. Reproduction in whole or in part is permitted for any purpose 
of the United States Government. 










168 SZE-TSEN HU 


the topological classification problem, one can formulate the homotopy classi- 
fication problem in the obvious fashion. 

To introduce the notion of isotopy equivalence, let us first recall the definition
of an imbedding. A continuous map $f: X \to Y$ is said to be an imbedding
provided that $f$ is a homeomorphism of $X$ onto the subspace $f(X)$ of $Y$.

A homotopy $h_t: X \to Y$, $(t \in I)$, is said to be an isotopy if, for each $t \in I$,
$h_t$ is an imbedding. Two imbeddings $f, g: X \to Y$ are said to be isotopic if there
exists an isotopy $h_t: X \to Y$, $(t \in I)$, such that $h_0 = f$ and $h_1 = g$.

An imbedding $f: X \to Y$ is said to be an isotopy equivalence if there exists an
imbedding $g: Y \to X$ such that the composite imbeddings $g \circ f$ and $f \circ g$ are
isotopic to the identity imbeddings on $X$ and $Y$ respectively. Two topological
spaces $X$ and $Y$ are said to be isotopically equivalent (in symbol, $X \cong Y$) if
there exists an isotopy equivalence $f: X \to Y$.

The relation $\cong$ among topological spaces is obviously an equivalence relation.
For any given family $F$ of topological spaces, this equivalence relation
$\cong$ divides $F$ into disjoint equivalence classes called the isotopy types of the
family. One can formulate the isotopy classification problem in the obvious
fashion.

By the definitions given above, it is clear that every homeomorphism is an 
isotopy equivalence and that every isotopy equivalence is a homotopy equiva- 
lence. 

Examples in the sequel will show that the converses are not always true. 
Hence, for any given family F of topological spaces, every topology type of 
$F$ is contained in some isotopy type of $F$, and every isotopy type of $F$ is
contained in some homotopy type of $F$. Consequently, the topological classification
problem can be broken into three steps as follows:

Step 1. Homotopy classification. Determine effectively all of the homotopy 
types of the family F. 

Step 2. Isotopy classification. For each homotopy type $\alpha$ of the family $F$,
determine effectively all of the isotopy types of the family $\alpha$.

Step 3. Topological classification. For each isotopy type $\beta$ of the family $F$,
determine effectively all of the topology types of the family $\beta$ and exhibit a
representative space in each of the topology types.

In order to carry out the three steps of the topological classification problem 
for a given family F of topological spaces, one must make use of the various 
properties of spaces which are preserved by homotopy equivalences, isotopy 
equivalences, and homeomorphisms respectively. These properties are called 
the homotopy properties, the isotopy properties, and the topological properties 
respectively. It follows that every homotopy property is an isotopy property 
and that every isotopy property is a topological property. Examples in the 
sequel will show that the converses of these implications do not always hold. 

The main purpose of the present paper is to give general tests for homotopy 
and isotopy properties in terms of hereditary and weakly hereditary properties 
with the elementary properties in general topology as illustrations. These will 










HOMOTOPY AND ISOTOPY PROPERTIES 169 


be given in §§ 2 and 3. In the final section of the paper, we will describe a 
general method of constructing new homotopy and isotopy properties out of 
old ones as a striking and profound synthesis of various isolated known 
results. 


2. Homotopy properties. A property P of topological spaces is called a 
homotopy property provided that it is preserved by all homotopy equivalences. 
Precisely, P is a homotopy property if and only if, for an arbitrary homotopy 
equivalence $f: X \to Y$, that $X$ has $P$ implies that $Y$ also has $P$. If a homotopy
property P is given in the form of a number, a set, a group, or some other 
similar object, P is said to be a homotopy invariant. 

Some of the elementary properties in general topology are homotopy proper- 
ties. As examples, one can easily prove the following assertions. 


PROPOSITION 2.1. Contractibility is a homotopy property of topological 
spaces. 


PROPOSITION 2.2. The cardinal number of components of a topological space 
X is a homotopy invariant. 


COROLLARY 2.3. Connectedness is a homotopy property of topological spaces. 


PROPOSITION 2.4. The cardinal number of path-components of a topological 
space X is a homotopy invariant. 


COROLLARY 2.5. Pathwise connectedness is a homotopy property of topological
spaces.


Nevertheless, most properties studied in general topology are not homotopy 
properties. To demonstrate this fact, let us first introduce the notion of weakly 
hereditary properties. 

A property P of topological spaces is said to be hereditary if each subspace 
of a topological space with P also has P; it is said to be weakly hereditary if 
every closed subspace of a topological space with $P$ also has $P$. For example,
the following properties of a topological space $X$ are weakly hereditary:

(A) $X$ is a $T_1$-space, that is, every point in $X$ forms a closed set of $X$.

(B) $X$ is a Hausdorff space.

(C) $X$ is a regular space.

(D) $X$ is a completely regular space.

(E) $X$ is a discrete space, that is, every set in $X$ is open.

(F) $X$ is an indiscrete space, that is, the only open sets in $X$ are the empty
set $\square$ and the set $X$ itself.

(G) $X$ is a metrizable space.

(H) The first axiom of countability is satisfied in $X$, that is, the neighbourhoods
of any point in $X$ have a countable basis.

(I) The second axiom of countability is satisfied in $X$, that is, the open
sets of $X$ have a countable basis.













(J) X can be imbedded in a given topological space Y. 

(K) For a given integer $n \geq 0$, dim $X \leq n$. Here, the inductive dimension
dim $X$ is defined as follows: dim $X = -1$ if $X$ is empty, and dim $X \leq n$ if
for every point $p \in X$ and every open neighbourhood $U$ of $p$ there exists an
open neighbourhood $V \subset U$ of $p$ such that dim $\partial V \leq n - 1$, where $\partial V$
denotes the boundary $\bar V \setminus V$ of $V$ in $X$ (2, p. 153).

(L) $X$ is a normal space.

(M) $X$ is a compact space.

(N) $X$ is a Lindelöf space, that is, every open covering of $X$ has a countable
subcovering.

(O) $X$ is a paracompact space.

(P) $X$ is a locally compact space.

(Q) For a given integer $n \geq 0$, Dim $X \leq n$. Here, the covering dimension
Dim $X$ is defined as follows: Dim $X \leq n$ if every finite open covering of $X$
has a refinement of order $\leq n$ (2, p. 153).

The first eleven properties (A)—(K) listed above are also hereditary. 

A topological space X is said to be a singleton space if X consists of a single 
point. Obviously, every singleton X has all of the properties (A)-(Q). On the
other hand, none of these properties prevails in all topological spaces. Hence
we deduce, as a consequence of the following theorem, the fact that none of
these properties (A)-(Q) is a homotopy property.


THEOREM 2.6. Let P be a weakly hereditary topological property such that every 
singleton space has P and suppose that there exists a topological space X which 
does not have P. Then P is not a homotopy property. 


Proof. Let X be a topological space which does not have P. Consider the
cone C(X) over X, which is the quotient space obtained by identifying the
top $X \times 1$ of the cylinder $X \times I$ to a single point v, called the vertex of the
cone C(X). Then the space X may be identified with the bottom $X \times 0$ of the
cone C(X) and hence X becomes a closed subspace of C(X). Since P is a
weakly hereditary property which X does not have, C(X) cannot have P.
On the other hand, it is well known that the inclusion map $i: v \subset C(X)$ is a
homotopy equivalence. Since the singleton space v has P but C(X) does not
have P, P is not a homotopy property. This completes the proof of (2.6).
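
The well-known fact quoted in the proof of (2.6) can be made explicit. The following sketch writes the points of C(X) as classes [x, s] with x in X and s in I; this bracket notation and the names $h_t$ and r are introduced here for illustration only and do not appear in the text.

```latex
% Points of C(X) are classes [x, s], with [x, 1] = v for every x in X.
\[
  h_t([x, s]) = [x,\ (1 - t)s + t], \qquad x \in X,\ s, t \in I .
\]
% h_t is well defined: s = 1 gives (1 - t) + t = 1, so v is sent to v.
\[
  h_0 = \text{identity on } C(X), \qquad
  h_1(C(X)) = \{v\}, \qquad
  h_t(v) = v \ \text{for all } t \in I .
\]
% With r : C(X) -> v the constant map and i : v -> C(X) the inclusion,
\[
  r \circ i = \text{identity on } v, \qquad
  i \circ r = h_1 \simeq h_0 = \text{identity on } C(X).
\]
```

Continuity of $(c, t) \mapsto h_t(c)$ on $C(X) \times I$ follows because the product of the quotient map $X \times I \to C(X)$ with the identity map on I is again a quotient map, I being compact Hausdorff.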


Although most of the properties studied in general topology are not homo- 
topy properties as shown by the foregoing theorem, it is well known that almost 
all invariants studied in algebraic topology are homotopy invariants, namely, 
the homology groups, the homotopy groups, etc. 

For topological spaces which are homotopically equivalent to CW-com- 
plexes, Postnikov, in his celebrated work (3), gave a complete system of 
homotopy invariants, now called the Postnikov system of the space. Any pair 
of these spaces are homotopically equivalent if and only if their Postnikov
systems are isomorphic. Hence, the homotopy classification problem of these 
spaces has been solved by Postnikov at least theoretically although his process 
is too complicated to be practicable. 


3. Isotopy properties. A property P of topological spaces is called an 
isotopy property provided that it is preserved by all isotopy equivalences. 
Precisely, P is an isotopy property if and only if, for an arbitrary isotopy
equivalence $f: X \to Y$, the fact that X has P implies that Y also has P. If an isotopy
property P is given in the form of a number, a set, a group, or some other 
similar object, P is said to be an isotopy invariant. 

Most of the elementary properties in general topology are isotopy properties.
For example, the eleven properties (A)-(K) listed in § 2 are isotopy
properties, as an immediate consequence of the following theorem.


THEOREM 3.1. Every hereditary topological property of spaces is an isotopy 
property. 


Proof. Let P be any hereditary topological property of spaces. Assume that
$f: X \to Y$ is an isotopy equivalence and that the space X has the property P.
It suffices to prove that Y also has P.

By definition of an isotopy equivalence, there exists an imbedding $g: Y \to X$
such that the composed imbeddings $g \circ f$ and $f \circ g$ are isotopic to the identity
imbeddings on X and Y respectively. The image g(Y) is a subspace of X. Since
P is hereditary, this implies that g(Y) has the property P. As an imbedding,
g is a homeomorphism of Y onto g(Y). Since P is a topological property and
g(Y) has P, it follows that Y also has P. This completes the proof of (3.1).


THEOREM 3.2. The inductive dimension dim X of a topological space X is an 
isotopy invariant. 


Proof. Let $f: X \to Y$ be any given isotopy equivalence and assume that
$\dim X = m$, $\dim Y = n$.
It suffices to prove that $m = n$.

Since $\dim X \le m$ and $f: X \to Y$ is an isotopy equivalence, it follows from
the fact that the property (K) of § 2 is an isotopy property that $\dim Y \le m$.
Hence, we obtain $n \le m$. By considering any isotopy inverse $g: Y \to X$ of f,
we can also prove that $m \le n$. Hence $m = n$ and (3.2) is proved.

Not all topological properties of spaces are isotopy properties. Examples 
are given by the following propositions. 

PROPOSITION 3.3. Compactness is not an isotopy property of topological
spaces.


Proof. Let Y denote the closed unit interval $I = [0, 1]$ and X the open unit
interval (0, 1) which is the interior of Y. It is well known that Y is compact
but X is non-compact. Hence, it suffices to prove that the inclusion $i: X \subset Y$
is an isotopy equivalence.
For this purpose, let $j: Y \to X$ denote the imbedding defined by

$j(t) = \tfrac{1}{3}(t + 1)$, ($0 \le t \le 1$).

It remains to prove that the composed imbeddings $j \circ i$ and $i \circ j$ are isotopic
to the identity imbeddings on X and Y respectively.

Define an isotopy $k_t: Y \to Y$, ($t \in I$), by taking

$k_t(y) = \tfrac{1}{3}(t + 3y - 2ty)$

for each $t \in I$ and each $y \in Y = I$. Since $k_t(X) \subset X$ for each $t \in I$, $k_t$ also
defines an isotopy $h_t: X \to X$, ($t \in I$).

Since $h_0$ and $k_0$ are the identity maps on X and Y respectively and since
$h_1 = j \circ i$ and $k_1 = i \circ j$, it follows that $j \circ i$ and $i \circ j$ are isotopic to the
identity imbeddings. This completes the proof of (3.3).
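
The properties of the isotopy used in the proof of (3.3) can be checked directly. The sketch below takes the isotopy to be $k_t(y) = \tfrac{1}{3}(t + 3y - 2ty)$ and the imbedding to be $j(y) = \tfrac{1}{3}(y + 1)$, as in the proof; only the intermediate computations are added here.

```latex
% Endpoint checks for k_t(y) = (t + 3y - 2ty)/3:
\[
  k_0(y) = \tfrac{1}{3}(3y) = y, \qquad
  k_1(y) = \tfrac{1}{3}(1 + y) = j(y).
\]
% Each k_t is affine in y with positive slope, hence an imbedding of Y into Y:
\[
  \frac{\partial k_t}{\partial y} = \frac{3 - 2t}{3} \ge \frac{1}{3} > 0
  \qquad (0 \le t \le 1).
\]
% Images of the closed and the open interval under k_t:
\[
  k_t([0, 1]) = \left[\tfrac{t}{3},\ 1 - \tfrac{t}{3}\right], \qquad
  k_t((0, 1)) = \left(\tfrac{t}{3},\ 1 - \tfrac{t}{3}\right) \subset (0, 1).
\]
```

The last inclusion is what allows $k_t$ to be restricted to an isotopy of the open interval into itself.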

Since the open interval X = (0, 1) is homeomorphic to the real line R, we
have also proved the following corollary. 


COROLLARY 3.4. The unit interval I = [0, 1] and the real line R are isotopically 
equivalent. 


Since the product of an arbitrary family of isotopy equivalences is clearly 
also an isotopy equivalence, we have the following generalization of (3.4). 


COROLLARY 3.5. For any cardinal number $\alpha$, the topological powers $I^\alpha$ and $R^\alpha$
are isotopically equivalent.


In particular, if $\alpha$ is a finite integer $n > 0$, the n-cube $I^n$ and the Euclidean
n-space $R^n$ are isotopically equivalent.

On the other hand, if $\alpha$ is infinite, $R^\alpha$ is not locally compact while $I^\alpha$ is
compact and hence locally compact. This proves the following proposition.


PROPOSITION 3.6. Local compactness is not an isotopy property of topological 
spaces. 
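
The two facts behind (3.6) can be spelled out as follows. This is a sketch of the standard argument and is not taken from the text; the names U, $U_\beta$, F, K, $\beta_0$ and $\pi_{\beta_0}$ are introduced here for illustration.

```latex
% I^alpha is compact by Tychonoff's theorem, being a product of compact spaces.
% A basic open set of R^alpha restricts only finitely many coordinates:
\[
  U = \prod_{\beta} U_\beta, \qquad
  U_\beta = R \ \text{for all } \beta \notin F, \quad F \text{ finite}.
\]
% If alpha is infinite, pick a coordinate beta_0 outside F. Suppose some point
% had a compact neighbourhood K; then K contains a nonempty basic open set U
% as above, and the projection onto the beta_0-th coordinate gives
\[
  R = \pi_{\beta_0}(U) \subset \pi_{\beta_0}(K) \subset R ,
\]
% so R would be a continuous image of the compact set K, which is impossible.
```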


4. Homotopy functors and isotopy functors. By a covariant homotopy
functor, we mean an operator $\phi$ which assigns to each topological space X a
topological space $\phi(X)$ and to each continuous map $f: X \to Y$ a continuous
map

$\phi(f): \phi(X) \to \phi(Y)$

satisfying the following three conditions:
(HF1) $\phi$ preserves identity, that is, if f is the identity map so is $\phi(f)$.
(HF2) $\phi$ preserves composition, that is, if $f: X \to Y$ and $g: Y \to Z$ are
continuous maps then we have

$\phi(g \circ f) = \phi(g) \circ \phi(f)$.

(HF3) $\phi$ preserves homotopy, that is, if the family $h_t: X \to Y$, ($t \in I$), of
continuous maps is a homotopy, so is the family

$\phi(h_t): \phi(X) \to \phi(Y)$, ($t \in I$).

If, in the preceding definition of a homotopy functor $\phi$, we have

$\phi(f): \phi(Y) \to \phi(X)$, $\phi(g \circ f) = \phi(f) \circ \phi(g)$,

then the operator $\phi$ is called a contravariant homotopy functor.

Similarly, by a covariant isotopy functor, we mean an operator $\psi$ which assigns
to each topological space X a topological space $\psi(X)$ and to each imbedding
$f: X \to Y$ an imbedding

$\psi(f): \psi(X) \to \psi(Y)$

satisfying the following three conditions:
(IF1) $\psi$ preserves identity, that is, if f is the identity imbedding so is $\psi(f)$.
(IF2) $\psi$ preserves composition, that is, if $f: X \to Y$ and $g: Y \to Z$ are
imbeddings then we have

$\psi(g \circ f) = \psi(g) \circ \psi(f)$.

(IF3) $\psi$ preserves isotopy, that is, if the family $k_t: X \to Y$, ($t \in I$), of
imbeddings is an isotopy, so is the family

$\psi(k_t): \psi(X) \to \psi(Y)$, ($t \in I$).

One can define contravariant isotopy functors by reversing the direction of the
imbeddings $\psi(f)$ and making the obvious modifications in (IF2) and (IF3).
Examples of homotopy and isotopy functors:


Example 1. Topological powers. Let n be any positive integer. Define an
operator $\phi$ as follows. For each topological space X, let $\phi(X)$ denote the
topological nth power $X^n$, that is, the topological product of n copies of the space X.
For each continuous map $f: X \to Y$, let $\phi(f)$ stand for the nth power
$f^n: X^n \to Y^n$ of f defined by

$f^n(x_1, \ldots, x_n) = (f(x_1), \ldots, f(x_n))$.

Then the conditions (HF1)-(HF3) can easily be verified and hence $\phi$ is a
covariant homotopy functor. Furthermore, if $f: X \to Y$ is an imbedding,
$\phi(f) = f^n$ is clearly also an imbedding. Hence, the restriction $\psi$ of $\phi$ on spaces
and imbeddings is a covariant isotopy functor.
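
The verification of (HF1)-(HF3) for topological powers, left to the reader in Example 1, might run as follows. This is a sketch only; the symbol $H^{(n)}$ and the name H for the map $(x, t) \mapsto h_t(x)$ are introduced here for illustration.

```latex
% (HF1): the nth power of the identity map is the identity on X^n.
\[
  (\mathrm{id}_X)^n(x_1, \ldots, x_n) = (x_1, \ldots, x_n).
\]
% (HF2): powers commute with composition, for f : X -> Y and g : Y -> Z.
\[
  (g \circ f)^n(x_1, \ldots, x_n)
    = (g(f(x_1)), \ldots, g(f(x_n)))
    = (g^n \circ f^n)(x_1, \ldots, x_n).
\]
% (HF3): if H : X x I -> Y is continuous with H(x, t) = h_t(x), then
\[
  H^{(n)}((x_1, \ldots, x_n), t) = (H(x_1, t), \ldots, H(x_n, t))
\]
% is continuous on X^n x I, since each coordinate is, and H^{(n)}(., t) = (h_t)^n.
```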

More generally, let G be a subgroup of the symmetric group S of the integers
1, ..., n, that is, S is the group of all permutations of the n integers 1, ..., n.
Then G operates on the topological power $X^n$ by permuting the factors of $X^n$.
Let $\phi_G(X)$ denote the orbit space $X^n/G$. Since the operators in G obviously
commute with the continuous maps $f^n: X^n \to Y^n$, each $f^n$ induces a continuous
map $\phi_G(f): \phi_G(X) \to \phi_G(Y)$. It follows that $\phi_G$ is a covariant homotopy
functor and its restriction $\psi_G$ on spaces and imbeddings is a covariant isotopy
functor.


Example 2. Residual functors. Let n be an integer greater than 1. Define an
operator $\psi$ as follows. For each topological space X, let $\psi(X)$ denote the
residual space $X^n \setminus d(X)$ obtained by deleting the diagonal d(X) from the
nth power $X^n$. If $f: X \to Y$ is an imbedding, the nth power $f^n$ carries $\psi(X)$ into
$\psi(Y)$ and hence defines an imbedding $\psi(f): \psi(X) \to \psi(Y)$. The conditions
(IF1)-(IF3) are obviously satisfied and hence $\psi$ is a covariant isotopy functor.
This isotopy functor $\psi$ is called the nth residual functor and is denoted by
$R_n$.

Let G be a subgroup of the symmetric group of the n integers 1, ..., n. Then
G also operates on the residual space $\psi(X)$. Let $\psi_G(X)$ denote the orbit space
$\psi(X)/G$. Then $\psi(f)$ induces an imbedding $\psi_G(f): \psi_G(X) \to \psi_G(Y)$ for each
imbedding $f: X \to Y$. Thus, $\psi_G$ is also a covariant isotopy functor.
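
The reason the nth power of an imbedding carries the residual space of X into that of Y is that an imbedding is one-to-one. A short check, assuming that the diagonal d(X) means the set of points of $X^n$ whose coordinates are all equal (the reading suggested by the identification of X with d(X) in Example 4):

```latex
% Assumed meaning of the diagonal: d(X) = { (x, ..., x) : x in X }.
\[
  f^n(x_1, \ldots, x_n) \in d(Y)
  \iff f(x_1) = \cdots = f(x_n)
  \iff x_1 = \cdots = x_n
  \iff (x_1, \ldots, x_n) \in d(X).
\]
% The middle equivalence uses that f is one-to-one. Hence
\[
  f^n\bigl(X^n \setminus d(X)\bigr) \subset Y^n \setminus d(Y).
\]
```

For a continuous map that is not one-to-one the middle equivalence fails, which is why the residual functor is defined on imbeddings only.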


Example 3. Mapping spaces. Let T be a given Hausdorff space. Define an
operator $\phi$ as follows. For each topological space X, let $\phi(X)$ stand for the
space Map(T, X) of all continuous maps from T into X with the compact-open
topology. For each continuous map $f: X \to Y$, let

$\phi(f): \mathrm{Map}(T, X) \to \mathrm{Map}(T, Y)$

denote the function defined by taking

$[\phi(f)](\xi) = f \circ \xi$

for each $\xi: T \to X$ in Map(T, X). One can verify that $\phi(f)$ is a continuous
map and that the conditions (HF1)-(HF3) are satisfied. Hence $\phi$ is a covariant
homotopy functor. Furthermore, if $f: X \to Y$ is an imbedding, so is $\phi(f)$. This
implies that the restriction $\psi$ of $\phi$ on spaces and imbeddings is a covariant
isotopy functor.


Example 4. Enveloping functors. Let n be any positive integer greater than
1. Define an operator $\psi$ as follows. For each topological space X, consider as in
Example 2 the nth power $X^n$ and identify X with the diagonal d(X) in $X^n$.
Then, $\psi(X)$ stands for the subspace of $\mathrm{Map}(I, X^n)$ consisting of the
continuous paths $\sigma: I \to X^n$ such that $\sigma(t) \in X$ if and only if $t = 0$. For each
imbedding $f: X \to Y$, it follows from the preceding examples that the imbedding
$f^n: X^n \to Y^n$ induces an imbedding of $\mathrm{Map}(I, X^n)$ into $\mathrm{Map}(I, Y^n)$ which
carries $\psi(X)$ into $\psi(Y)$ and hence defines an imbedding

$\psi(f): \psi(X) \to \psi(Y)$.

One can easily verify that the conditions (IF1)-(IF3) are satisfied and hence
$\psi$ is a covariant isotopy functor. This isotopy functor $\psi$ is called the nth
enveloping functor and is denoted by $E_n$. For the remaining case $n = 1$, we may
define $E_1(X)$ to be the subspace of Map(I, X) consisting of the continuous
paths $\sigma: I \to X$ such that $\sigma(t) = \sigma(0)$ if and only if $t = 0$.








For each subgroup G of the symmetric group of the n integers 1, ..., n, similar
modifications may be made as in Examples 1 and 2. 
The usefulness of these functors can be seen from the following two theorems. 


THEOREM 4.1. If $\phi$ is a homotopy functor, then every homotopy property of
$\phi(X)$ induces a homotopy property of X.


Proof. Let P be an arbitrary homotopy property. Assume that $f: X \to Y$
is a homotopy equivalence and that $\phi(X)$ has P. We have to prove that $\phi(Y)$
must also have P. For this purpose, it suffices to show that $\phi(f)$ is also a
homotopy equivalence.

Let $g: Y \to X$ be a continuous map such that the compositions $g \circ f$ and
$f \circ g$ are homotopic to the identity maps on X and Y respectively. Then there
exist homotopies $h_t: X \to X$ and $k_t: Y \to Y$, ($t \in I$), such that $h_0 = g \circ f$,
$k_0 = f \circ g$, and $h_1$, $k_1$ are identity maps. By (HF3), $\phi(h_t)$ and $\phi(k_t)$ are
homotopies. By (HF2), $\phi(h_0)$ and $\phi(k_0)$ are the two compositions of $\phi(f)$ and $\phi(g)$.
By (HF1), $\phi(h_1)$ and $\phi(k_1)$ are the identity maps on $\phi(X)$ and $\phi(Y)$
respectively. Hence $\phi(f)$ is a homotopy equivalence. This completes the proof of
(4.1).
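
The three steps of the proof of (4.1) can be collected into two chains of relations. The sketch below is written for a covariant functor; for a contravariant one the order of the compositions on the left is reversed.

```latex
% Covariant case, with h_0 = g o f, k_0 = f o g and h_1, k_1 identity maps.
\[
  \phi(g) \circ \phi(f) = \phi(g \circ f) = \phi(h_0)
    \simeq \phi(h_1) = \text{identity on } \phi(X),
\]
\[
  \phi(f) \circ \phi(g) = \phi(f \circ g) = \phi(k_0)
    \simeq \phi(k_1) = \text{identity on } \phi(Y).
\]
% (HF2) gives the first equality in each line, (HF3) the homotopy,
% and (HF1) the last equality.
```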


For example, let us take P to be pathwise connectedness. For each
homotopy functor $\phi$, we may define a new homotopy property which might be
called $\phi$-pathwise connectedness as follows. A topological space X is said
to be $\phi$-pathwise connected provided that $\phi(X)$ is pathwise connected. By
(4.1), we know that $\phi$-pathwise connectedness is a homotopy property of
topological spaces. In particular, if $\phi$ is the homotopy functor constructed in
Example 3 with $T = S^1$ the unit 1-sphere, then one can easily see that a
topological space X is $\phi$-pathwise connected if and only if it is simply connected.
Thus, this gives us the well-known fact that simple connectedness is a
homotopy property of topological spaces.
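
The equivalence between pathwise connectedness of $\mathrm{Map}(S^1, X)$ and simple connectedness of X can be sketched as follows. This outline is not taken from the text; the notation $c_x$ for the constant loop at x, e for evaluation at the base point 1 of $S^1$, and $\gamma$ for a path in X is introduced here for illustration.

```latex
% A path in Map(S^1, X) is the same thing as a free homotopy S^1 x I -> X,
% because S^1 is compact Hausdorff (exponential law, compact-open topology).
%
% (1) Map(S^1, X) pathwise connected  =>  X simply connected.
\[
  x \mapsto c_x \ (\text{constant loop at } x), \qquad
  e(\xi) = \xi(1), \qquad e(c_x) = x ,
\]
% so e maps Map(S^1, X) onto X and X is pathwise connected. Moreover every
% loop xi is freely homotopic to a constant loop, hence extends over the
% disk D^2 and represents the trivial element:
\[
  \xi \simeq c_x \ \text{(freely)} \;\Longrightarrow\; [\xi] = 1 \in \pi_1(X, \xi(1)).
\]
% (2) X simply connected  =>  Map(S^1, X) pathwise connected.
\[
  \xi \simeq c_{\xi(1)}, \qquad
  c_x \simeq c_y \ \text{via } t \mapsto c_{\gamma(t)},\
  \gamma \text{ a path from } x \text{ to } y .
\]
```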

Analogously, we have the following 


THEOREM 4.2. If $\psi$ is an isotopy functor, then every isotopy property of $\psi(X)$
induces an isotopy property of X; in particular, every homotopy property of
$\psi(X)$ induces an isotopy property of X.


The proof of (4.2) is similar to that of (4.1) and hence omitted. 


COROLLARY 4.3. If $\psi$ is an isotopy functor, then all homotopy invariants of
$\psi(X)$, such as the homology groups of $\psi(X)$, are isotopy invariants of X.


By suitable choices of the isotopy functors $\psi$, (4.3) provides many new
isotopy invariants of topological spaces which enable us to solve problems
in isotopy theory. For example, let us consider a family of topological spaces

$W_q^p$ ($p \ge 0$, $q \ge 0$),

where $W_q^p$ denotes the linear graph obtained by attaching p small triangles
at each end of a line-segment ab and joining the two ends of ab by q broken
lines $a c_k b$, $k = 1, 2, \ldots, q$. Let

$r = 2p + q$.
Since the Euler characteristic of $W_q^p$ is

$\chi(W_q^p) = 1 - r$,

it follows that the homotopy classification problem of this family of spaces
$\{W_q^p : p \ge 0, q \ge 0\}$ is solved by the homotopy invariant $r = 2p + q$.
Precisely, $W_q^p$ and $W_t^s$ are homotopically equivalent if and only if

$2p + q = 2s + t$.

For the isotopy classification of the spaces $W_q^p$ with the same $r = 2p + q$,
let us use the second residual functor $R_2$. In (1), it has been computed that the
two-dimensional homology group of $R_2(W_q^p)$ is a free abelian group of rank
$2p^2$ and the one-dimensional homology group of $R_2(W_q^p)$ is a free abelian
group of rank $6p^2 + 4pq + q^2 + 2p + q - 1$ for all $W_q^p$ with $2p + q > 0$.

This solves the isotopy classification problem for the spaces $W_q^p$.
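
As an illustration of how these ranks separate spaces of the same homotopy type, consider the two graphs of the family with r = 2. The computation below assumes the rank formulas quoted from (1), read as $2p^2$ and $6p^2 + 4pq + q^2 + 2p + q - 1$, and writes $H_i$ for the i-dimensional homology group of the second residual space of the graph.

```latex
% Both graphs below have r = 2p + q = 2, hence the same homotopy type.
\[
  (p, q) = (1, 0):\quad
  \operatorname{rank} H_2 = 2 \cdot 1^2 = 2, \qquad
  \operatorname{rank} H_1 = 6 + 0 + 0 + 2 + 0 - 1 = 7 ;
\]
\[
  (p, q) = (0, 2):\quad
  \operatorname{rank} H_2 = 0, \qquad
  \operatorname{rank} H_1 = 0 + 0 + 4 + 0 + 2 - 1 = 5 .
\]
% In general, for fixed r the rank 2p^2 of H_2 determines p >= 0,
% and then q = r - 2p is determined as well.
```

Since the ranks differ, the two graphs are not isotopically equivalent although they are homotopically equivalent.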


REFERENCES 


1. S. T. Hu, Isotopy invariants of topological spaces, Technical Note, AFOSR TN59-236,
AD212006. Also to appear in the Proceedings of the Royal Society, England.

2. W. Hurewicz and H. Wallman, Dimension Theory (Princeton Mathematical Series, No. 4
[Princeton University Press, 1941]).

3. M. M. Postnikov, Investigations in homotopy theory of continuous mappings I, II, Trudy
Mat. Inst. Steklov, No. 46, Izdat. Akad. Nauk SSSR (Moscow, 1955). Amer. Math.
Soc. Translations, Series 2, 7 (1957).

4. A. Shapiro, Obstructions to the imbedding of a complex in a euclidean space I, Ann. Math.,
66 (1957), 256-269.

5. W. T. Wu, On the realization of complexes in Euclidean spaces I-III, Acta Math. Sinica, 5
(1955), 505-552; 7 (1957), 79-101; 8 (1958), 79-94.

6. W. T. Wu, On the imbedding of polyhedra in Euclidean spaces, Bull. Polon. Sci. Cl. III, 4 (1956),
573-577.


University of California at Los Angeles. 








